Algorithm Problem #2 Follow-Up #1: Random Binary Tree

Context

This is a follow-up on previous note, where unordered equality predicate for binary trees was developed. Creating property based tests for this code requires ability to generate random binary trees, which can be then used to test solution against multiple random inputs.

It’ll be implementation of Rémy algorithm, originally described by Jean-Luc Rémy in 1985. I don’t know French, so I haven’t actually used this paper when implementing code, but I’m referencing original paper for sake of citing original sources.

Instead, I’ve based my code on a paper by Erkki Mäkinen and Jarmo Siltaneva, which luckily is in English. Additionally, there’s a description of applying Chi-Squared test to empirically verifying implementation correctness with respect to providing correct random distribution. In this particular case expectation is to have uniform distribution - i.e. each one of C_n trees with N internal nodes is equally probable on output for a given N.

Last thing is that this code will only be concerned with drawing a random tree shape, leaving out any enrichment - like filling in values for keys etc. for later steps.

Tree Representation

Implementation of Unordered Tree Equality assumed that Binary Trees will be represented as having left and right subtree and containing a value called key of some generic type K, so here’s a reminder of how the contract looked like.

Tree.java

public interface Tree<K> {
    K key();
    Tree<K> left();
    Tree<K> right();
}

Remy Algorithm performs incremental construction of data structure by modifying state constructed so far, thus a mutable data structure conforming to above contract is required. It’ll be called TreeMutable and in order to avoid much boiler-plate code Lombok library will be used. Using Lombok will allow to have repetitive code like constructors, getters and setters etc. generated automatically. Additionally, fluent accessors will be generated instead of usual Java get/set convention.

TreeMutable.java

@NoArgsConstructor
@Getter
@Setter
@ToString
public class TreeMutable<K> implements Tree<K> {
    private K key;
    private TreeMutable<K> left, right;
}

Implementation - class outline

Code will follow functional convention of Java and so it’ll implement IntFunction and Function from Integer both parameterised with TreeMutable<T>.

An instance of Random will be needed and @RequiredArgsConstructor to pass it, then boilerplate code for Function<Integer, TreeMutable<T> will be needed to forward call to apply(int).

RandomTreeRemy.java

@RequiredArgsConstructor
public class RandomTreeRemy<T> implements IntFunction<TreeMutable<T>>, Function<Integer, TreeMutable<T>> {
    private final Random random;
    
    @Override
    public TreeMutable<T> apply(int internalNodes) {
        // Next paragraph will fill this in
        return null;
    }

    @Override
    public TreeMutable<T> apply(Integer integer) {
        if (integer == null) {
            throw new NullPointerException();
        }

        return apply(integer.intValue());
    }
}

Implementation - preparation

Now some preparation before algorithm main loop has to be done.

First thing is to do a sanity check for arguments, ensure that requested number of internal nodes is not negative. Note that 0 internal nodes is entirely fine - this just gives a tree composed of just one leaf node.

Then compute tree target size, which will be 2 * internalNodes + 1. Remember that algorithm will generate complete binary trees, so each node will either be a leaf or will have 2 subtrees.

Once target size is known initialize array to store the nodes, it’ll be needed to randomly pick a node at each loop iteration, then root node has to be created and put into beginning of array.

Last step is to outline loop where incremental tree construction will happen, only thing is that each step 2 nodes will be added hence loop counter will be incremented by 2, not by 1 as is usually the case.

preparation

        if (internalNodes < 0) {
            throw new IllegalArgumentException();
        }

        int limit = 2 * internalNodes + 1;
        TreeMutable<T>[] nodes;
        nodes = new TreeMutable[limit];

        // Initialize
        nodes[0] = new TreeMutable<>();

        for (int i = 1; i < limit; i+= 2) {
            // Filled in next section
        }

Implementation - main loop

First step in main loop is to draw two things from random:

Random int from 0..i-1 to choose a node from nodes created so far
Random boolean to use it to decide linking direction of new leaf and subtree

Then after picking a node from array of existing nodes two new nodes have to be created call them leaf and subtree.

Then put newly created nodes inside nodes array after already existing nodes.

Central step is rearrangement of nodes to extend the tree.

Node subtree will take children from root and both leaf and subtree will be assigned as new children of root.

Respectively as left and right or otherwise depending on value of direction.

main loop

    for (int i = 1; i < limit; i+= 2) {
        int hit=random.nextInt(i);
        boolean direction=random.nextBoolean();

        TreeMutable<T> root=nodes[hit];

        TreeMutable<T> tree=new TreeMutable<>();
        TreeMutable<T> leaf=new TreeMutable<>();

        nodes[i]=tree;
        nodes[i+1]=leaf;

        tree.left(root.left());
        tree.right(root.right());

        if(direction){
        root.left(tree);
        root.right(leaf);
        }else{
        root.left(leaf);
        root.right(tree);
        }
    }

Final step is to return root of the whole tree located at nodes[0]

end

return nodes[0];

Source Code

RandomTreeRemy.java