It’s been over 25 years since I had to code a priority queue. Should I really need to? Java now supplies a priority queue, and an important lesson of software development is that you should first seek to reuse what you can. But it’s worthwhile to know how to build such things yourself, in case you’re in an environment where one doesn’t exist.
As usual, though, I’m mostly interested in honing my ability to incrementally grow things, including interesting data structures. What can I learn from test-driving a priority queue?
All a priority queue does is return elements from a queue in priority order. Per the Wikipedia page, there are many ways to implement a priority queue, including a couple bad ways. Right off the bat, there’s no way I could force a priority queue to use a binary heap–one of the better choices–given only the client needs. Test-driving a solution and building the “simplest” possible implementation would result in the worst-performing solution.
So how to force the issue? Well, the nice thing about a binary heap is that it has a very concisely defined structure, one that I can verify easily. (I’ll refer to a binary heap as simply a heap for the rest of this article.) A binary heap is a tree where:
-
The key for each node is greater than or equal to each of its children.
-
The tree is complete–the last level is filled left-to-right, and all other levels are filled.
I know I want to end up with a heap–I need the performance–and will use that as a guiding factor in test-driving a solution. Does this violate YAGNI? Maybe. Other options? I could introduce a performance-related test, one where I tried to verify the big-O of each of the three operations (adding a new item, peeking at the highest priority item, and removing the highest priority item–the last two of which are often combined into a poll method). That would be tough to do well, and still might not guarantee an implementation that used a binary heap.
No, I need to be able to peek at the implementation. TDD will still be a great help. I still want to build the heap incrementally–there’s a decent amount of code involved, and plenty of opportunities to get it wrong. I’ll start simple, however, and not worry about heap characteristics until I get basic queue functionality working:
import org.junit.*;
import static org.junit.Assert.*;
import static org.hamcrest.CoreMatchers.*;
public class PriorityQueueTest {
private static final Object ITEM_50 = "50";
private static final Object ITEM_100 = "100";
PriorityQueue queue;
@Before
public void create() {
queue = new PriorityQueue();
}
@Test
public void emptyWhenCreated() {
assertThat(queue.isEmpty(), is(true));
}
@Test
public void noLongerEmptyAfterAdd() {
queue.add(ITEM_50, 50);
assertThat(queue.isEmpty(), is(false));
}
@Test
public void singletonQueueReturnsSoleItemOnPoll() {
queue.add(ITEM_50, 50);
assertThat(queue.poll(), is(ITEM_50));
}
@Test
public void isEmptyAfterSoleElementRemoved() {
queue.add(ITEM_50, 50);
queue.poll();
assertThat(queue.isEmpty(), is(true));
}
@Test
public void returnsFIFOWithTwoAlreadyOrderedItems() {
queue.add(ITEM_100, 100);
queue.add(ITEM_50, 50);
assertThat(queue.poll(), is(ITEM_100));
assertThat(queue.poll(), is(ITEM_50));
assertThat(queue.isEmpty(), is(true));
}
}
I won’t bore you with the rote, incremental steps used to get to this point. For now the solution is simply a class that wraps an ordered list. There’s no support for pulling things out in other than FIFO order, since there are yet no tests that demand the capability.
import java.util.*;
public class PriorityQueue {
private List<Object> heap = new ArrayList<Object>();
public boolean isEmpty() {
return heap.isEmpty();
}
public void add(Object item, int priority) {
heap.add(item);
}
public Object poll() {
return heap.remove(0);
}
}
All those tests to get to this trivial bit of code? No big deal. About five minutes of work, and I’ve gotten some good cases documented and clamped down.
The next test makes things a little more interesting:
@Test
public void returnsInPriorityOrder() {
queue.add(ITEM_50, 50);
queue.add(ITEM_100, 100);
assertThat(queue.poll(), is(ITEM_100));
assertThat(queue.poll(), is(ITEM_50));
assertThat(queue.isEmpty(), is(true));
}
The implementation now requires actually using the priority, associating it with each item, hence the introduction of the Node class. It also means I need a little bit of “real heap logic” in the add method: If the second item is higher priority than the first, sift it to the front. Since there are only two items, it’s a hard-coded swap. The way I coded it also happens to work for single-element heaps.
import java.util.*;
public class PriorityQueue {
private List<Node> heap = new ArrayList<Node>();
class Node {
final Object object;
final int priority;
Node(Object object, int priority) {
this.object = object;
this.priority = priority;
}
public boolean isEmpty() {
return heap.isEmpty();
}
public void add(Object item, int priority) {
Node node = new Node(item, priority);
heap.add(node);
if (node.priority > heap.get(0).priority)
siftUp();
}
private void siftUp() {
Node temp = heap.get(0);
heap.set(0, heap.get(1));
heap.set(1, temp);
}
public Object poll() {
return heap.remove(0).object;
}
}
Interestingly, I still don’t have a proper tree with parents and children–just a list with a couple slots filled in. In looking about at ways to implement a binary tree, the most appealing is to simply build it in-place in an array–something easily done given the constraint that the tree must be complete. Using an array will make it easy (and fast) to navigate from parent to child and vice versa.
The tests could use a bit of clarity. I add a helper method to simplify my assertion (and distill it to one line):
@Test
public void returnsFIFOWithTwoAlreadyOrderedItems() {
queue.add(ITEM_100, 100);
queue.add(ITEM_50, 50);
assertQueuePollingOrder(ITEM_100, ITEM_50);
}
@Test
public void returnsInPriorityOrder() {
queue.add(ITEM_50, 50);
queue.add(ITEM_100, 100);
assertQueuePollingOrder(ITEM_100, ITEM_50);
}
private void assertQueuePollingOrder(Object... expectedItems) {
List<Object> items = new ArrayList<Object>();
while (!queue.isEmpty())
items.add(queue.poll());
assertThat(items, is(Arrays.asList(expectedItems)));
}
Time to move on to the more interesting cases.
Thinking a bit about the next test (thinking is allowed, in fact encouraged, in TDD, particularly when determining what to do next)… Adding a third element to the heap will require a bit more generalized logic, and a fourth element (which would represent the third level of the heap’s tree) even more so. All complex enough without worrying about the complexity of removal, which requires re-heapifying.
One thing at a time. Rather than solve both challenges at once, I’ll first build out support for adding elements. Instead of using polling to verify that adding worked properly, I’ll verify that the priority queue’s underlying heap is valid.
Instead of creating a new test, I introduce a new assertion in each of my two existing tests (abusing the guideline of one assert per test).
@Test
public void returnsFIFOWithTwoAlreadyOrderedItems() {
queue.add(ITEM_100, 100);
queue.add(ITEM_50, 50);
assertQueuePollingOrder(ITEM_100, ITEM_50);
assertValid(queue.heap());
}
@Test
public void returnsInPriorityOrder() {
queue.add(ITEM_50, 50);
queue.add(ITEM_100, 100);
assertQueuePollingOrder(ITEM_100, ITEM_50);
assertValid(queue.heap());
}
private void assertValid(List<Priorityqueue.Node> heap) {
assertSubheap(heap, Integer.MAX_VALUE, 0);
}
private void assertSubheap(
List<Priorityqueue.Node> heap, int priority, int root) {
if (root > heap.size() - 1)
return;
PriorityQueue.Node node = heap.get(root);
assertThat(priority > node.priority, is(true));
assertSubheap(heap, node.priority, leftChild(root));
assertSubheap(heap, node.priority, rightChild(root));
}
private int rightChild(int index) {
return index * 2 + 2;
}
private int leftChild(int index) {
return index * 2 + 1;
}
Now I can quickly write the next failing test (with a lame name):
@Test
public void heapifiesThreeElements() {
queue.add(ITEM_50, 50);
queue.add(ITEM_75, 75);
queue.add(ITEM_100, 100);
assertValid(queue.heap());
}
… and get it to pass quickly too:
public void add(Object item, int priority) {
Node node = new Node(item, priority);
heap.add(node);
int rightmost = heap.size() - 1;
if (node.priority > heap.get(0).priority)
siftUp(rightmost);
}
private void siftUp(int rightmost) {
Node temp = heap.get(0);
heap.set(0, heap.get(rightmost));
heap.set(rightmost, temp);
}
relevant lines: 4,6,9,11,12
Another test:
~~~ java
@Test
public void heapifiesMultipleLevels() {
add(50, 75, 100, 125);
assertValid(queue.heap());
}
This time it requires a bit more manipulation of existing code. I rename
siftUp
to simply swap
, because that’s all it does, and the “sift up”
algorithm is in the body of the add
method.
public void add(Object item, int priority) {
heap.add(new Node(item, priority));
int current = heap.size() - 1;
while (priority > parentPriority(current)) {
swap(current, parent(current));
current = parent(current);
}
}
private int parentPriority(int index) {
return heap.get(parent(index)).priority;
}
private int parent(int index) {
return (index - 1) / 2;
}
private void swap(int x, int y) {
Node temp = heap.get(x);
heap.set(x, heap.get(y));
heap.set(y, temp);
}
It’s not bad, but I want each method to be at one level of abstraction,
so I modify the add
method:
public void add(Object item, int priority) {
addNewNode(item, priority);
siftUpFromLast(priority);
}
private void addNewNode(Object item, int priority) {
heap.add(new Node(item, priority));
}
private void siftUpFromLast(int priority) {
int current = heap.size() - 1;
while (priority > parentPriority(current)) {
swap(current, parent(current));
current = parent(current);
}
}
I think that’s it for add
! To confirm, I use the example from
Wikipedia:
@Test
public void integrationTest() {
add(1, 2, 3, 7, 17, 19, 25, 36, 100);
assertValid(queue.heap());
}
Oh yeah–all those test lines required to add elements were getting annoying, so I added the following helper in the test (and updated other tests correspondingly):
private void add(int... priorities) {
for (int priority: priorities)
queue.add(String.valueOf(priority), priority);
}
On to polling. The general algorithm: capture value at front of heap
(the highest priority per add
‘s post-condition), move the rightmost
item to the front, and re-heapify. To re-heapify: compare the item at
the root with both its children, swap with the larger if necessary, and
re-heapify the subtree.
From a test perspective, all we need to verify is that the highest priority item was returned and that the completeness post-condition held.
@Test
public void removeAnswersHighest() {
add(20, 30, 40, 100, 50, 60);
assertThat(queue.poll(), is((Object)"100"));
}
@Test
public void removeReheapifies() {
add(20, 30, 40);
queue.poll();
assertValid(queue.heap());
}
After incrementally introducing changes based on those two test failures, here’s the explicit bit of code that covers lists up to three elements:
public Object poll() {
Object result = heap.remove(0).object;
if (heap.size() == 2) {
if (heap.get(0).priority < heap.get(1).priority)
swap(0, 1);
}
return result;
}
The addition of another item generates the next test failure:
@Test
public void removeReheapifyMustConsiderBothChildren() {
add(40, 20, 30, 10);
queue.poll();
assertValid(queue.heap());
}
I use programming by intention to express the algorithm clearly,
starting with the poll
method:
public Object poll() {
if (heap.size() == 1)
return heap.remove(0).object;
Object result = heap.get(0).object;
reheapify();
return result;
}
private void reheapify() {
moveLastToFirstSlot();
if (priorityLessThanEitherChild(0))
swap(0, highestPriorityChild(0));
}
The reheapify
method has the odd side effect of eliminating the first
(highest) element by overwriting it. That this is not explicit in the
poll
method suggests there’s probably a better expression of the
entire algorithm.
I flesh out the trivial supporting methods:
private void moveLastToFirstSlot() {
Node rightmost = heap.remove(heap.size() - 1);
heap.set(0, rightmost);
}
private boolean priorityLessThanEitherChild(int index) {
int priority = heap.get(index).priority;
return priority < leftPriority(index) ||
priority < rightPriority(index);
}
private int highestPriorityChild(int index) {
return leftPriority(index) > rightPriority(index)
? leftChild(index)
: rightChild(index);
}
private int rightPriority(int index) {
return priority(rightChild(index));
}
private int leftPriority(int index) {
return priority(leftChild(index));
}
private int priority(int index) {
if (index > heap.size() - 1)
return Integer.MIN_VALUE;
return heap.get(index).priority;
}
private int leftChild(int index) {
return index * 2 + 1;
}
private int rightChild(int index) {
return index * 2 + 2;
}
It’s a good chunk of code, but it still fit comfortably within my five-minute cycle limit.
Hey–I recognize those leftChild
and rightChild
methods from my test!
I’ll go make ’em static so the test can re-use them.
About the only other interesting logic is rooted in the priority
method, which returns Integer.MIN_VALUE if there’s no item at an index.
That allows comparisons in priorityLessThanEitherChild
to quietly fail
when no child exists.
Almost there. The final push is to force repeated swaps from root down through multiple subtrees:
@Test
public void removeMustReheapifyMultipleLevels() {
add(50, 40, 20, 30, 10);
queue.poll();
assertValid(queue.heap());
}
Getting a passing test requires a nice, small increment of code (and a TPP transformation from if=>while. Recursion is one up in the priority list–should I refactor this method?).
private void reheapify() {
moveLastToFirstSlot();
int i = 0;
while (priorityLessThanEitherChild(i)) {
int child = highestPriorityChild(i);
swap(i, child);
i = child;
}
}
Both add
and poll
methods appear to be complete. Here’s a passing
test that fires end-to-end:
@Test
public void pollsInPriorityOrder() {
add(40, 20, 60, 80, 10, 30, 90, 70, 50);
assertQueuePollingOrder(
"90", "80", "70", "60", "50", "40", "30", "20", "10");
}
Boy, do those quotes suck. Well, it’s time to do some global clean-up, so I’ll fix that along with some other things. Here’s the test class after some various cleanup for expressiveness:
import static org.hamcrest.CoreMatchers.*;
import static org.junit.Assert.*;
import java.util.*;
import org.junit.*;
public class PriorityQueueTest {
PriorityQueue queue;
@Before
public void create() {
queue = new PriorityQueue();
}
@Test
public void emptyWhenCreated() {
assertThat(queue.isEmpty(), is(true));
}
@Test
public void noLongerEmptyAfterAdd() {
queue.add("", 50);
assertThat(queue.isEmpty(), is(false));
}
@Test
public void singletonQueueReturnsSoleItemOnPoll() {
queue.add("abc", 50);
assertThat(queue.poll(), is((Object)"abc"));
}
@Test
public void isEmptyAfterSoleItemRemoved() {
queue.add("", 50);
queue.poll();
assertThat(queue.isEmpty(), is(true));
}
@Test
public void returnsFIFOWithTwoAlreadyOrderedItems() {
add(100, 50);
assertQueuePollingOrder(100, 50);
assertValid(queue.heap());
}
@Test
public void returnsTwoUnorderedItemsInPriorityOrder() {
add(50, 100);
assertQueuePollingOrder(100, 50);
assertValid(queue.heap());
}
@Test
public void heapifiesThreeItems() {
add(50, 75, 100);
assertValid(queue.heap());
}
@Test
public void heapifiesMultipleLevels() {
add(50, 75, 100, 125);
assertValid(queue.heap());
}
@Test
public void removeAnswersHighest() {
add(20, 30, 40, 100, 50, 60);
assertThat(queue.poll(), is((Object)"100"));
}
@Test
public void removeReheapifiesTwoItems() {
add(20, 30, 40);
queue.poll();
assertValid(queue.heap());
}
@Test
public void removeReheapifyMustConsiderBothChildren() {
add(40, 20, 30, 10);
queue.poll();
assertValid(queue.heap());
}
@Test
public void removeMustReheapifyMultipleLevels() {
add(50, 40, 20, 30, 10);
queue.poll();
assertValid(queue.heap());
}
@Test
public void pollExhaustsHeapInPriorityOrder() {
add(40, 20, 60, 80, 10, 30, 90, 70, 50);
assertQueuePollingOrder(90, 80, 70, 60, 50, 40, 30, 20, 10);
}
private void assertQueuePollingOrder(Integer... expectedItems) {
List<Integer> items = new ArrayList<Integer>();
while (!queue.isEmpty())
items.add(Integer.parseInt((String)queue.poll()));
assertThat(items, is(Arrays.asList(expectedItems)));
}
private void add(int... priorities) {
for (int priority: priorities)
queue.add(String.valueOf(priority), priority);
}
private void assertValid(List<Priorityqueue.Node> heap) {
assertSubheap(heap, Integer.MAX_VALUE, 0);
}
private void assertSubheap(List<Priorityqueue.Node> heap, int priority, int root) {
if (root > heap.size() - 1)
return;
PriorityQueue.Node node = heap.get(root);
assertThat(priority > node.priority, is(true));
assertSubheap(
heap, node.priority, PriorityQueue.leftChild(root));
assertSubheap(
heap, node.priority, PriorityQueue.rightChild(root));
}
}
And here’s the implementation:
import java.util.*;
public class PriorityQueue {
private List<Node> heap = new ArrayList<Node>();
static class Node {
final Object object;
final int priority;
Node(Object object, int priority) {
this.object = object;
this.priority = priority;
}
public String toString() {
return String.valueOf(priority);
}
}
public boolean isEmpty() {
return heap.isEmpty();
}
public void add(Object item, int priority) {
addNewNode(item, priority);
siftUpFromLast(priority);
}
private void addNewNode(Object item, int priority) {
heap.add(new Node(item, priority));
}
private void siftUpFromLast(int priority) {
int current = heap.size() - 1;
while (priority > parentPriority(current)) {
swap(current, parent(current));
current = parent(current);
}
}
private int parentPriority(int index) {
return heap.get(parent(index)).priority;
}
private int parent(int index) {
return (index - 1) / 2;
}
private void swap(int x, int y) {
Node temp = heap.get(x);
heap.set(x, heap.get(y));
heap.set(y, temp);
}
public Object poll() {
if (heap.size() == 1)
return heap.remove(0).object;
Object result = heap.get(0).object;
reheapify();
return result;
}
private void reheapify() {
moveLastToFirstSlot();
int i = 0;
while (priorityLessThanEitherChild(i)) {
int child = highestPriorityChild(i);
swap(i, child);
i = child;
}
}
private void moveLastToFirstSlot() {
Node rightmost = heap.remove(heap.size() - 1);
heap.set(0, rightmost);
}
private boolean priorityLessThanEitherChild(int index) {
int priority = heap.get(index).priority;
return priority < leftPriority(index)
|| priority < rightPriority(index);
}
private int highestPriorityChild(int index) {
return leftPriority(index) > rightPriority(index)
? PriorityQueue.leftChild(index)
: PriorityQueue.rightChild(index);
}
int rightPriority(int index) {
return priority(rightChild(index));
}
int leftPriority(int index) {
return priority(leftChild(index));
}
private int priority(int index) {
if (index > heap.size() - 1)
return Integer.MIN_VALUE;
return heap.get(index).priority;
}
List heap() {
return heap;
}
static int rightChild(int index) {
return index * 2 + 2;
}
static int leftChild(int index) {
return index * 2 + 1;
}
}
What else is there to complain about?
-
My test names could use some work. It’s hard to derive good test names when you’re driving incremental implementation of an algorithm, instead of an interface.
-
I probably have some tests I can get rid of.
-
The tests could be composed better. Worse, the test knows about the Node class (which the client need not know about). Maybe binary heap verification should be built into the heap implementation itself.
-
Should there be a PriorityQueue class and a separate Heap class? I thought about this a while and didn’t see the value.
-
I don’t use the Comparable interface to “simplify” and externalize comparisons (i.e. allow the client to specify how priorities compare).
-
I don’t have any tests that show what happens when priorities are equal.
-
I don’t use safety braces (i.e. braces around single-line if statements)–that’s why I have tests. I don’t make everything private that could be private (package access is fine by me, particularly in tests since I don’t have to type the access modifier).
What do I like?
-
I was able to test-drive the implementation, sticking to the five-minute rule for getting green bars. Things were a bit close a couple times (was there a more incremental way?).
-
I think the algorithm as implemented is very easy to follow (read each of the
add
andpoll
methods), particularly when compared to what is typically built (here’s one implementation; see, for example, the percolateDown method).
What do you think? Is test-driving a priority queue a good kata? I lay down the challenge flag for a better progression and implementation (shouldn’t be hard).