Let's consider a fairly standard splayed table and then how we can speed up querying it. The table:

The query we want to optimizes our system for is findRows:

First Optimize the Query

The first where clause of a select query should always be the most restrictive [SQL Notes] i.e. the one that returns the fewest rows. The s column has many more unique items than the ex column, therefore filtering on it first will restrict the results faster. Let's consider the timings:

Notice here I am using the \t function which runs a query and returns the time it take to ran in milliseconds. If I supply a number \t:n, then the query is ran n times and the total time returned. This is explained more in our query timing tutorial

Add on-disk Attributes?

The next step we can consider taking is adding on-disk attributes to the columns used in the where clause. We will add a `g# attribute to both columns and use getFileSizes to show the space required on disk

We've now made the same query, 5 times faster using on-disk attributes with our splayed table but can we do better..?

Use in memory-attributes

Rather than store the attributes on-disk, requiring disk space and slowing access we could store them in-memory. This assumes that we have sufficient RAM which for a splayed table can quite often be the case.

We've successfully gone from 1821 milliseconds to run our query, to 16 milliseconds to run the same query. We hope you found this useful. This tutorial is a small subsection of one module from our full kdb+ training courses. If you are new to kdb+, you will find our Introductory kdb+ training course the best way to master kdb+.