Friday, November 20, 2020

kdb+/q - File Compression

Large tables can be compressed in kdb+ by setting .z.zd before writing them to disk. Compression reduces disk usage and, for applications that have fast CPUs but slow disks, can even improve performance.

.z.zd is a list of three integers: the logical block size (given as a power of 2, so 17 means 128 KB blocks), the algorithm (0=none, 1=q IPC, 2=gzip, 3=snappy, 4=lz4hc) and the compression level.

Here is an example showing how to compress a table:

// Helper function that sets .z.zd and
// returns the previous value of .z.zd
.util.setZzd:{
  origZzd:$[count key `.z.zd;.z.zd;()];
  if[x~();
    system"x .z.zd";
    :origZzd;
  ];
  .z.zd:x;
  origZzd}

// create a table
td:([]a:1000000?10; b:1000000?10; c:1000000?10);

// save the table to disk without compression
`:uncompressed set td;

// save the table to disk using q IPC compression
origZzd:.util.setZzd[(17;1;0)];
`:compressed set td;
.util.setZzd[origZzd];

You can check compression stats by using the -21! function:

q)-21!`:compressed
compressedLength  | 5747890
uncompressedLength| 24000041
algorithm         | 1i
logicalBlockSize  | 17i
zipLevel          | 0i

The size of the file on disk is reduced from about 22.9 MB to 5.5 MB (roughly a 4x reduction) after using q IPC compression.
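The megabyte figures follow directly from the byte counts reported by -21!. As a rough sketch of that arithmetic (the class and method names below are made up for illustration and have nothing to do with kdb+ itself):

```java
public class CompressionStats {

    // How many times smaller the compressed file is.
    static double ratio(long uncompressedBytes, long compressedBytes) {
        return (double) uncompressedBytes / compressedBytes;
    }

    // Percentage of disk space saved by compression.
    static double savedPercent(long uncompressedBytes, long compressedBytes) {
        return 100.0 * (1.0 - (double) compressedBytes / uncompressedBytes);
    }

    // Converts a byte count to megabytes (1 MB taken as 1024 * 1024 bytes).
    static double toMegabytes(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        // Values copied from the -21! output above.
        long compressedLength = 5_747_890L;
        long uncompressedLength = 24_000_041L;

        System.out.println(String.format("uncompressed: %.1f MB", toMegabytes(uncompressedLength)));
        System.out.println(String.format("compressed:   %.1f MB", toMegabytes(compressedLength)));
        System.out.println(String.format("ratio:        %.2fx", ratio(uncompressedLength, compressedLength)));
        System.out.println(String.format("saved:        %.1f%%", savedPercent(uncompressedLength, compressedLength)));
    }
}
```

The ratio you get in practice depends heavily on the data. The table here holds only the values 0 to 9, which is about as compressible as long columns get.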

Thursday, November 19, 2020

Testing Expected Exceptions with JUnit 5

This post shows how to test for expected exceptions using JUnit 5. If you're still on JUnit 4, please check out my previous post.

Let's start with the following class that we wish to test:

public class Person {
  private final String name;
  private final int age;
    
  /**
   * Creates a person with the specified name and age.
   *
   * @param name the name
   * @param age the age
   * @throws IllegalArgumentException if the age is not greater than zero
   */
  public Person(String name, int age) {
    // Validate before assigning any fields.
    if (age <= 0) {
      throw new IllegalArgumentException("Invalid age:" + age);
    }
    this.name = name;
    this.age = age;
  }
}

To test that an IllegalArgumentException is thrown if the age of the person is not greater than zero, you should use JUnit 5's assertThrows as shown below:

import static org.hamcrest.CoreMatchers.*;
import static org.hamcrest.MatcherAssert.*;
import static org.junit.jupiter.api.Assertions.*;

import org.junit.jupiter.api.Test;

class PersonTest {

  @Test
  void testExpectedException() {
    assertThrows(IllegalArgumentException.class, () -> {
      new Person("Joe", -1);
    });
  }

  @Test
  void testExpectedExceptionMessage() {
    final Exception e = assertThrows(IllegalArgumentException.class, () -> {
      new Person("Joe", -1);
    });
    assertThat(e.getMessage(), containsString("Invalid age"));
  }
}
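If you are curious what assertThrows does with the lambda, here is a simplified, dependency-free sketch of the idea. It is not JUnit's actual source: the Executable interface and the checkAge helper are stand-ins written for this example, and the real method produces richer failure messages.

```java
public class AssertThrowsSketch {

    /** Stand-in for JUnit 5's Executable: a block of code that may throw. */
    @FunctionalInterface
    interface Executable {
        void execute() throws Throwable;
    }

    /**
     * Runs the executable and returns the thrown exception if it is an
     * instance of the expected type. Fails with an AssertionError if nothing
     * is thrown or if an exception of a different type is thrown.
     */
    static <T extends Throwable> T assertThrows(Class<T> expectedType, Executable executable) {
        try {
            executable.execute();
        } catch (Throwable actual) {
            if (expectedType.isInstance(actual)) {
                return expectedType.cast(actual);
            }
            throw new AssertionError(
                "Unexpected exception type: " + actual.getClass().getName(), actual);
        }
        throw new AssertionError(
            "Expected " + expectedType.getName() + " but nothing was thrown");
    }

    // Hypothetical stand-in for the validation in the Person constructor.
    static void checkAge(int age) {
        if (age <= 0) {
            throw new IllegalArgumentException("Invalid age:" + age);
        }
    }

    public static void main(String[] args) {
        // The returned exception can be inspected further, e.g. its message.
        IllegalArgumentException e =
            assertThrows(IllegalArgumentException.class, () -> checkAge(-1));
        System.out.println(e.getMessage());

        // Zero is rejected as well, so the boundary is worth its own test.
        assertThrows(IllegalArgumentException.class, () -> checkAge(0));
    }
}
```

Because the thrown exception is returned, the second test above can make further assertions on its message.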

Related post: Testing Expected Exceptions with JUnit 4 Rules

Tuesday, October 13, 2020

Java 15: Sealed Classes

Java 15 introduces Sealed Classes as a preview language feature. A sealed class or interface restricts which other classes or interfaces may extend or implement it. Here is an example:

public sealed class Vehicle permits Car, Truck, Motorcycle { ... }

final class Car extends Vehicle { ... }
final class Truck extends Vehicle { ... }
final class Motorcycle extends Vehicle { ... }

In the example above, Vehicle is a sealed class that specifies three permitted subclasses: Car, Truck and Motorcycle.

The subclasses must be:

  • in the same module as the superclass (or in the same package, if the superclass is in the unnamed module). You can even define them in the same source file as the superclass (if they are small in size), in which case the permits clause is not required because the compiler will infer them from the declarations in the file.
  • declared either final (i.e. cannot be extended further), sealed (i.e. permit further subclasses in a restricted fashion) or non-sealed (i.e. open for extension by any class).
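The two rules above can be sketched in a single source file. This example assumes Java 17 or later, where sealed classes are no longer a preview feature. The PickupTruck and Scooter classes and the permittedSubclassNames helper are invented for illustration. Vehicle omits the permits clause because all of its direct subclasses live in the same file.

```java
import java.util.Arrays;
import java.util.List;

public class SealedModifiersDemo {

    // No permits clause: the compiler infers Car, Truck and Motorcycle
    // from the declarations in this source file.
    static abstract sealed class Vehicle { }

    // final: cannot be extended further.
    static final class Car extends Vehicle { }

    // sealed: permits further subclasses, but only the ones listed.
    static sealed class Truck extends Vehicle permits PickupTruck { }
    static final class PickupTruck extends Truck { }

    // non-sealed: open for extension by any class.
    static non-sealed class Motorcycle extends Vehicle { }
    static class Scooter extends Motorcycle { }

    // Returns the sorted simple names of the permitted direct subclasses,
    // or an empty list if the type is not sealed.
    static List<String> permittedSubclassNames(Class<?> type) {
        Class<?>[] permitted = type.getPermittedSubclasses();
        if (permitted == null) {
            return List.of();
        }
        return Arrays.stream(permitted)
                .map(Class::getSimpleName)
                .sorted()
                .toList();
    }

    public static void main(String[] args) {
        System.out.println(permittedSubclassNames(Vehicle.class));
        System.out.println(permittedSubclassNames(Truck.class));
        System.out.println(permittedSubclassNames(Motorcycle.class));
    }
}
```

Only direct subclasses count as permitted, so PickupTruck and Scooter do not appear in the list for Vehicle.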

Sealing serves two main purposes:

  1. It restricts which classes or interfaces can be a subtype of a class or interface and thus preserves the integrity of your API.
  2. It allows the compiler to list all the permitted subtypes of a sealed type (exhaustiveness analysis), which will (in a future Java release) enable switching over type patterns in a sealed type (and other features). For example, given the following switch expression, the compiler will detect that there is a case for every permitted subclass of Vehicle (so no default clause is needed), and it will report an error if any of them is missing:
    int doSomething(Vehicle v) {
      return switch (v) {
          case Car c -> ...
          case Truck t -> ...
          case Motorcycle m -> ...
      };
    }

Monday, July 20, 2020

kdb+/q - Try Catch

Programming languages typically have a try-catch mechanism for dealing with exceptions. The try block contains the code you want to execute and the catch block contains the code that will be executed if an error occurs in the try block.

Here is an example of a simple try-catch block in Java, which attempts to parse a string into an int and returns -1 if there is an error.

static int parseInt(String x) {
    try {
        return Integer.parseInt(x);
    } catch (NumberFormatException e) {
        e.printStackTrace();
        return -1;
    }
}

In this post, I will describe the try-catch equivalent for exception handling in the q programming language.

.Q.trp[f;x;g] - for unary functions

For unary functions, you can use .Q.trp (Extend Trap At), which takes three arguments:

  1. f - a unary function to execute
  2. x - the argument of f
  3. g - a function to execute if f fails. This function is called with two arguments, the error string x and the backtrace object y

For example:

// Define a function which casts a string to int
parseInt:{[x] "I"$x}

// Define an error function which prints the stack trace and returns -1
// Note: .Q.sbt formats the backtrace object and 2@ prints to stderr
g:{[x;y] 2@"Error: ",x,"\nBacktrace:\n",.Q.sbt y;-1i}

// Try calling the function (wrapped by .Q.trp) with a valid argument
.Q.trp[parseInt;"123";g]
123i

// Try calling the function (wrapped by .Q.trp) with an invalid argument
// The error function is called and the stack trace is printed
.Q.trp[parseInt;`hello;g]
Error: type
Backtrace:
  [2]  parseInt:{[x] "I"$x}
                        ^
  [1]  (.Q.trp)

  [0]  .Q.trp[parseInt;`hello;g]
       ^
-1i

Note: An alternative is Trap At, which has the syntax @[f;x;e], but it does not give you the backtrace, so .Q.trp is usually the better choice.
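For readers coming from Java, the f, x and g arguments of .Q.trp map loosely onto a small helper like the one below. This is only an analogy, not how q implements it. The trap and parseOrMinusOne names are invented for this sketch, and a Java stack trace stands in for q's backtrace object.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

public class TrapSketch {

    /**
     * Applies f to x. If f throws, calls g with the error message and the
     * stack trace, and returns whatever g returns.
     */
    static <T, R> R trap(Function<T, R> f, T x,
                         BiFunction<String, StackTraceElement[], R> g) {
        try {
            return f.apply(x);
        } catch (RuntimeException e) {
            return g.apply(String.valueOf(e.getMessage()), e.getStackTrace());
        }
    }

    // Parses a string to an int, falling back to -1 on any parse error.
    static int parseOrMinusOne(String s) {
        Function<String, Integer> parse = Integer::parseInt;
        BiFunction<String, StackTraceElement[], Integer> onError = (message, trace) -> {
            System.err.println("Error: " + message);
            return -1;
        };
        return trap(parse, s, onError);
    }

    public static void main(String[] args) {
        System.out.println(parseOrMinusOne("123"));
        System.out.println(parseOrMinusOne("hello"));
    }
}
```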

.[f;args;e] - for n-ary functions

.Q.trp only works for unary functions. For functions with more than one argument, you need to use Trap, which has the syntax .[f;args;e]. The error handler e is called with the error message string only, so no backtrace is available. For example:

// Define a ternary function that sums its arguments
add:{[x;y;z] x+y+z}

.[add;1 2 3;{2@"Failed to perform add";-1}]
6

.[add;(1;2;`foo);{2@"Failed to perform add\n";-1}]
Failed to perform add
-1

Friday, July 10, 2020

Compute MD5 Checksum Hash on Windows and Linux

Use the following commands to print out the MD5 hash for a file.

On Windows:

> CertUtil -hashfile myfile.txt MD5
MD5 hash of file myfile.txt:
76383c2c0bfca944b57a63830c163ad2
CertUtil: -hashfile command completed successfully.

On Linux/Unix:

$ md5sum myfile.txt
76383c2c0bfca944b57a63830c163ad2 *myfile.txt
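
If you need the same checksum from code rather than from the command line, the JDK's MessageDigest class can compute it on any platform. Here is a minimal sketch, assuming Java 11 or later. The Md5Checksum class and its method names are made up for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Md5Checksum {

    private static MessageDigest newMd5() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to support MD5.
            throw new IllegalStateException(e);
        }
    }

    // Returns the MD5 hash of the given bytes as a lowercase hex string.
    static String md5Hex(byte[] data) {
        return toHex(newMd5().digest(data));
    }

    // Streams the file through the digest so large files are not loaded into memory.
    static String md5Hex(Path file) throws IOException {
        MessageDigest md = newMd5();
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                md.update(buffer, 0, n);
            }
        }
        return toHex(md.digest());
    }

    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            // Hash the file named on the command line.
            System.out.println(md5Hex(Path.of(args[0])));
        } else {
            // No file given: hash a sample string instead.
            System.out.println(md5Hex("hello".getBytes(StandardCharsets.UTF_8)));
        }
    }
}
```

Run against the same file, the file-based method should print the same hex string as CertUtil and md5sum.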