Sunday, November 06, 2016

Detecting a Loop in a Linked List

In a previous post, I wrote about how you can detect a loop in a linked list using Floyd's Hare and Tortoise algorithm. You have two iterators - the hare and the tortoise - one of them runs faster than the other and, if the list has a loop, they will eventually collide. Here is a recap of the algorithm:

public static boolean hasLoop(final Node head) {
  Node tortoise = head;
  Node hare = head;
  while (hare != null && hare.next != null) {
    tortoise = tortoise.next;
    hare = hare.next.next;
    if (hare == tortoise) { // collision
      return true;
    }
  }
  return false;
}
How do you find the start of the loop?

Now, let's make this problem more interesting by finding the node at the beginning of the loop.

Consider the following illustration of a linked list with a loop. We need to find the start of the loop, which is 3.

0 - 1 - 2 - 3
           / \
         10   4
          |   |
          9   5
          |   |
          8   6
           \ /
            7

We know that the hare runs twice as fast as the tortoise. Let's say that the tortoise enters the loop after k steps. Then the hare has taken 2k steps in total and is, therefore, k steps ahead of the tortoise within the loop. This means that hare and tortoise are LOOP_SIZE - k nodes away from each other. Since hare moves two nodes for each node that tortoise moves, they move one node closer to each on each turn and will meet after LOOP_SIZE - k turns, and both will be k nodes from the start of the loop at that point. The start of the loop is also k nodes from the beginning of the list.

In my example, the tortoise enters the loop (node 3) after 3 steps. At that stage, the hare has taken 6 steps in total and is at node 6. They collide at node 8 after 5 steps (LOOP_SIZE = 8, k = 3, LOOP_SIZE - k = 5). The start of the loop is 3 nodes away from the point of collision and also 3 nodes away from the head of the list.

So, if we keep the hare where it is (at the point of collision) and move the tortoise back to the beginning of the list, and then move each of them one step at a time, they will collide at the start of the loop!

Here is the algorithm:

public static Node findLoopStart(final Node head) {
  Node tortoise = head;
  Node hare = head;

  boolean hasLoop = false;
  while (hare != null && hare.next != null) {
    tortoise = tortoise.next;
    hare = hare.next.next;
    if (hare == tortoise) {
      hasLoop = true;
      break;
    }
  }

  if (hasLoop) {
    tortoise = head;
    while (tortoise != hare) {
      tortoise = tortoise.next;
      hare = hare.next;
    }
    return tortoise;
  }

  return null;
}

Sunday, October 30, 2016

Java 8: Updates to Map

In Java 8, the Map interface has been updated to include several new, convenient methods. Some of the important changes are outlined below:

1. computeIfAbsent

computeIfAbsent is particularly useful when implementing the caching pattern. For example, previously, you would do this:

private final Map<String, Data> map = new HashMap<>();

public Data getData(final String key) {
  Data data = map.get(key); // look in cache
  if (data == null) { // data is not in cache
    data = computeData(key); // perform an (expensive) operation
    map.put(key, data); // put data in cache
  }
  return data; // return data
}

You can now write this method more concisely as follows:

public Data getData(final String key) {
  return map.computeIfAbsent(key, this::computeData);
}

You can also use computeIfAbsent when implementing a "multi-map" (Map<K, Collection<V>>):

 map.computeIfAbsent(key, k -> new ArrayList<V>()).add(v);
2. putIfAbsent

putIfAbsent puts a value in the map, if the key is not already present. For example, instead of this:

final Map<String, String> map = new HashMap<>();
final String v = map.get(k);
if (v == null) {
  map.put(k, newValue);
}

you can write:

final Map<String, String> map = new HashMap<>();
map.putIfAbsent(k, newValue);
3. getOrDefault

getOrDefault returns a default value, if a specified key is not found in the map. Previously, you would do this:

final Map<String, String> map = new HashMap<>();
return map.containsKey(key) ? map.get(key) : "foo";

Now you can write:

final Map<String, String> map = new HashMap<>();
return map.getOrDefault(key, "foo");
4. Performance

The implementation of HashMap has been updated so that when buckets become too big (see the "TREEIFY_THRESHOLD" setting) they are transformed from lists (with O(n) retrieval) into trees (with faster O(log(n)) retrieval). This improves performance when hashCode() methods return values that are poorly distributed, or when many keys share the same hashCode, so long as they are also Comparable.

Sunday, September 25, 2016

Java 8 Concurrency: LongAdder and LongAccumulator

Java 8 introduces LongAdder, LongAccumulator, DoubleAdder and DoubleAccumulator, which are recommended instead of the Atomic classes when multiple threads update frequently but read less frequently (for example, in the context of gathering statistics). Under high thread contention, these new classes are designed to grow dynamically and, therefore, there is higher throughput at the expense of higher space consumption.

The "Adder" classes support operations for additions, whereas the "Accumulator" clases are given a function to combine values.

To understand how these classes work, you need to take a look at the implementation of the base class, Striped64. It has a "base" field, plus a table of "cells", each of which is a padded variant of AtomicLong to reduce cache contention. It first tries to update the base field using the usual AtomicLong CAS operation. If there is no contention, CAS will succeed. However, if there is contention (i.e. the CAS update of "base" failed), it will use the "probe" value (hash value) of the thread to map to a cell in the table. If a cell does not exist, it will be created. New cells are created upon contention until the number of cells reaches the nearest power of two greater than or equal to the number of CPUs. If the cell exists, CAS will be tried to update the value of the cell. If CAS fails, the probe value will be updated using a secondary hash until an uncontended cell is found. Finally, the sum method simply adds the base field and the elements in the cells array.

The code below shows how you can use LongAdder to calculate the sum of several values:

LongAdder adder = new LongAdder();

// do some addition in different threads
adder.add(10);
adder.increment();
// ...

// get the current sum
long sum = adder.sum();

Or you can use LongAccumulator as follows:

LongAccumulator acc = new LongAccumulator(Long::sum, 0);

acc.accumulate(10);
adder.increment();
// ...

// get the current sum
long sum = acc.get();

You can also use LongAdder in conjuction with a ConcurrentHashMap to maintain a frequency map:

ConcurrentMap<String, LongAdder> freqs = new ConcurrentHashMap<>();
freqs.computeIfAbsent("foo", k -> new LongAdder()).increment();

Sunday, September 18, 2016

Power Set Algorithm

The power set of a set S is the set of all subsets of S including the empty set and S itself. For example, the power set of {1, 2, 3} is {{}, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}}.

The problem can be solved using recursion. We know that if the initial set is empty, then its power set is {{}} and if it has only one element {a} then its power set is {{}, {a}}.

Here is the algorithm:

public static <E extends Comparable<E>> List<List<E>> subsets(final List<E> list) {
  final List<List<E>> result = new ArrayList<>();

  // if the list is empty there is only one subset, which is the empty set
  if (list.isEmpty()) {
    result.add(Collections.emptyList());
    return result;
  }

  // otherwise, get the first element
  final E first = list.get(0);

  // find the subsets of the rest of the list
  final List<E> rest = list.subList(1, list.size());
  final List<List<E>> subsets = subsets(rest);
  result.addAll(subsets);

  // create new subsets by inserting the first element into each subset
  for (final List<E> subset : subsets) {
    final List<E> newSubset = new ArrayList<>(subset);
    newSubset.add(0, first);
    result.add(newSubset);
  }

  return result;
}

Sunday, September 11, 2016

Factorial Using Iteration, Recursion and Java 8 Streams

In this post, we will solve the classic school problem of calculating the factorial of a positive integer, using different Java algorithms.

Note: In all the algorithms below, we assume that the input is > 1.

1. Iterative Factorial

In this algorithm, we use a standard for-loop and increment the variables i and result at each iteration:

static int iterativeFactorial(final int n) {
  int result = 1;
  for (int i = 1; i <= n; i++) {
    result *= i;
  }
  return result;
}
2. Recursive Factorial

Here is the recursive version (the function calls itself):

static int recursiveFactorial(final int n) {
  return n == 1 ? 1 : n * recursiveFactorial(n - 1);
}

Recursion lets you get rid of variables that are updated at each iteration. Typically, making a recursive function call is more expensive than iteration because each time the function is called, a new stack frame has to be created on the call stack to hold the state of each function call. Therefore, the memory consumed by the recursive function is proportional to the size of its input. You may even get a StackOverflowError with a large input.

3. Tail-recursive Factorial

Here's a tail-recursive definition of factorial. The recursive call is the last thing that happens in the function. In contrast, in our previous definition of factorial, the last thing was n multiplied by the result of the recursive call.

static int tailRecursiveFactorial(final int n) {
  return tailRecursiveFactorialHelper(n, 1);
}
private static int tailRecursiveFactorialHelper(final int n, final int acc) {
  return n == 1 ? acc : tailRecursiveFactorialHelper(n - 1, n * acc);
}

Tail-recursive functions can be optimised by the compiler in some programming languages. Instead of storing each intermediate result of the recursion onto different stack frames, a single stack frame can be reused instead, because there's no need to keep track of intermediate results. Unfortunately, Java does not support this kind of optimisation, but you might want to take this approach anyway, in case the compiler is optimised in the future. Other JVM languages (e.g. Scala) can optimise tail-recursion.

4. Stream Factorial

Finally, Java 8 streams provide a simple, declarative way of defining factorial:

static int streamFactorial(final int n) {
  return IntStream.rangeClosed(1, n).reduce(1, (a, b) -> a * b);
}