Large tables can be compressed in kdb+ by setting .z.zd. Compressing data can reduce disk cost and, in some cases, even improve performance for applications that have fast CPUs but slow disks.
.z.zd is a list of three integers: the logical block size (given as a power of 2), the algorithm (0=none, 1=q IPC, 2=gzip, 3=snappy, 4=lz4hc) and the compression level.
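As a small illustration of the three fields, the sketch below sets gzip compression. The particular values (17, 2, 6) are just an example choice, not a recommendation, and the valid range of the level depends on the algorithm.

```q
// a sketch: 128 KB logical blocks, gzip (algorithm 2), level 6
.z.zd:17 2 6

// the first element is an exponent, so the block size in bytes is 2 xexp 17
"j"$2 xexp 17   // 131072

// remove the setting again so later writes are uncompressed
\x .z.zd
```

While .z.zd is set, it acts as a process-wide default for files written with set, which is why the helper in the example that follows restores the previous value afterwards.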
Here is an example showing how to compress a table:
// Helper function that sets .z.zd and
// returns the previous value of .z.zd
.util.setZzd:{
  origZzd:$[count key `.z.zd;.z.zd;()];
  if[x~();
    system"x .z.zd";
    :origZzd;
    ];
  .z.zd:x;
  origZzd}

// create a table
td:([]a:1000000?10; b:1000000?10; c:1000000?10);

// save the table to disk without compression
`:uncompressed set td;

// save the table to disk using q IPC compression
origZzd:.util.setZzd[(17;1;0)];
`:compressed set td;
.util.setZzd[origZzd];
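If changing a global setting is undesirable, set also accepts the compression parameters alongside the file handle, which applies them to that one write only. The following is a minimal sketch; the file name `:compressedGzip and the gzip level 6 are illustrative assumptions, not part of the example above.

```q
// compress only this write: (file; logical block size; algorithm; level)
(`:compressedGzip;17;2;6) set td;

// .z.zd was never touched, so there is nothing to restore afterwards
-21!`:compressedGzip
```

This avoids the save-and-restore dance around .z.zd, at the cost of repeating the parameters at each call site.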
You can check compression stats by using the -21! function:
q)-21!`:compressed
compressedLength  | 5747890
uncompressedLength| 24000041
algorithm         | 1i
logicalBlockSize  | 17i
zipLevel          | 0i
The size of the file on disk is reduced from 22.8 MB to 5.5 MB after using q IPC compression.
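The compression ratio can be derived directly from the dictionary that -21! returns. This is a small sketch; the variable name stats is only for illustration.

```q
// read the compression stats for the file written earlier
stats:-21!`:compressed;

// ratio of original size to on-disk size
stats[`uncompressedLength]%stats`compressedLength
```

With the figures shown above (24000041 and 5747890 bytes), this works out to roughly 4.2. The actual ratio depends heavily on the data; random integers drawn from a small range, as in this table, compress well.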