Home

Algorithm Problem #1: N-Way Merge

Problem Statement

Given N ascending iterators implement an iterator that will merge all given iterators into single ascending iterator.

Stating problem semi formally will be the following:

Breakdown

Let’s breakdown problem into smaller problems

  1. There’s a requirement to present solution in form of Iterator and minimal contract on Iterator interface requires implementing next() and hasNext() methods. Therefore, correct solution needs to focus on those.

  2. Elements returned from next() should be in ascending order selecting from all input iterators. Therefore, it’s required to retrieve candidate (or head) elements from each input iterator and keep track of them to allow selecting smallest to be returned, while also remembering value origin iterator.

  3. Selecting the smallest element among candidates should be fast operation and shouldn’t require a repeated lookup among candidate values, so queueing elements according to priority (i.e. head element value) should work.

  4. Once source iterator has no more elements it should be discarded and no longer taken into account, but that shouldn’t force queue rebuild.

Solution

Start with addressing point 1 and INPUT definition by creating a generic class, that will contain a solution

public class NWayMerge<T extends Comparable<T>> implements Iterator<T> {

    public NWayMerge(List<Iterator<T>> sources) {

    }

    @Override
    public boolean hasNext() {
        return false;
    }
    
    @Override
    public T next() {
        return null;
    }
}

Next addressing point 2 we’ll need a Node class, that will pair head value of each source and source itself. So we’ll add a static inner class to solution code which will wrap a source Iterator and allow for inspection of head element. In order to allow for easy interaction with wrapped iterator and avoid exposing interal state let’s make Node implement Iterator interface by forwarding values from wrapped Iterator.

static class Node<T extends Comparable<T>> implements Iterator<T> {

    private final Iterator<T> source;
    private T nextValue;
    private boolean hasNextFlag;

    public Node(Iterator<T> sourceArg) {
        this.source = sourceArg;
        advance();
    }
    
    @Override
    public boolean hasNext() {
        return hasNextFlag;
    }

    @Override
    public T next() {
        T returnValue = head;
        advance();
        return returnValue;
    }
    
    private void advance() {
        if (source.hasNext()) {
            hasNextFlag = true;
            head = source.next();
        } else {
            hasNextFlag = false;
            head = null;
        }
    }
}

Private method advance() takes care for taking next element from source iterator and making it ready for return in Node::next() method and setting flag used by Node::hasNext() method. Note that it should be invoked in Node::new to ensure correct initial state.

Putting those together let’s update implementation of NWayMerge::new constructor and wrap all the input sources into Node classes. Note that now null guard for sources is needed and still nothing is done with created Node instances.

public NWayMerge(List<Iterator<T>> sources) {
    if (sources == null) return;

    sources.stream()
        .map(Node::new)
}

Going to address point 3 finding an efficient way of selecting the smallest element among candidates. We’ll put a PriorityQueue as a field of NWayMerge and we’ll be comparing Node for value of head field. But in order to be able to put Node instances into a PriorityQueue we need to make Node implement Comparable interface.

static class Node<T extends Comparable<T>> implements Iterator<T>, Comparable<Node<T>> {
//...    
    @Override
    public int compareTo(Node<T> other) {
        return this.head.compareTo(other.head);
    }
//...
}

Having Node implement Comparable allows to put it into a PriorityQueue, but edge case of receiving already empty iterator needs to be handled, so before actually adding nodes into queue empty ones need to be filtered out.

public class NWayMerge<T extends Comparable<T>> implements Iterator<T> {

    private final PriorityQueue<Node<T>> queue = new PriorityQueue<>();

    /**
     * @param sources List of Iterators providing values of T, assumed to be sorted in ascending order.
     */
    public NWayMerge(List<Iterator<T>> sources) {
        if (sources == null) return;

        sources.stream()
                .map(Node::new)
                .filter(Node::hasNext)
                .forEach(queue::add);
    }
//...

Now that there’s a queue of elements implementation of next() and hasNext() can be finished.

hasNext() is easy - it just checks if queue still has something to offer.

    @Override
    public boolean hasNext() {
        return !queue.isEmpty();
    }

next() is a bit more demanding and needs also to address point 4 i.e. discarding already empty iterators.

    @Override
    public T next() {
        if (queue.isEmpty()) {
            throw new NoSuchElementException();
        }

        Node<T> node = queue.remove();
        T returnValue = node.next();
        if (node.hasNext()) {
            queue.add(node);
        }

        return returnValue;
    }

Complexity Analysis

VALUES

TIME: Single call to next() costs O(lg N). Retrieving all elements from NWayMerge will cost O(K lg N)

MEMORY: NWayMerge requires O(N) additional memory for storing instances of Node and overhead of PriorityQueue instance

Complete Implementation and Tests