Kdb has timeseries data specific joins that provide powerful tools for analysing tick data in particular. Due to kdb being column-oriented and based on ordered lists, the syntax is usually much more concise and the speed much faster than standard sql databases.

Contents

  1. Asof Time Join - join one row across based on latest date/time.
    1. aj - Asof join, typically used to find the last value from one table, that matches the source table. e.g. Prevailing quote
    2. aj0 - Same as aj but uses the lookup tables time column
    3. asof - A more limited form of asof join that only returns the lookup values
  2. Time Window Join
    1. wj - For each entry in source table, Find rows that match from the lookup table within or before a given time interval
    2. wj1 - Same as wj but prevailing values not included, only those within the interval.

Asof Time Join

We will use the following simplified trade-t and quote-q tables to demonstrate the various joins.

Quote Table q

Example Quote Table q

Trade Table t

Example Trade Table q

aj

AJ
aj[ cols; sourceTable; lookupTable]
AJ0
aj0[ cols; sourceTable; lookupTable]

For each row in the source table lookup a matching value in the lookup table, by matching on the columns specified in cols. cols is a list of column names where the initial columns MUST match exactly and the last column matches the closest value LESS-THAN in the source table.

sourceTable
The table whos items you want to try and find close matches for, the result will have the same number of rows as this table.
lookupTable
The table used for finding matching data to join, the size and schema of this table will strongly affect the speed.
cols
A list of columns to use for joining on
the initial columns excluding the last will be matched exactly
the last column matches if an entry less-than is found.

aj0

AJ0 is the exact same as aj but returns the lookup tables time column.

asof

Asof is a built-in kdb function, that provides a limited version of AJ, you may find it used occasionally.

An alternative method of viewing time-series data for examing sequential events between tables, is using the union join uj to get a combined table then sorting the full table on time.

Running AJ on large tables

Running time-series joins such as AJ on large amounts of data takes a significant amount of time. By applying a grouped attribute to the sym column we reduced the time from over half a second to under a tenth of a second. You must be careful running aj/wj's, particularly against on-disk data, it is recommended that you consult the documentation on code kx or consult an experienced kdb programmer if you have any issues.

Time Window Join

We will use the following simplified trade-t and quote-q tables to demonstrate the various time window joins.

Quote Table q

Example Quote Table q

Trade Table t

Example Trade Table q

wj

WJ
wj[ windows; cols; sourceTab; (lookupTab;(agg0;col0);(agg1;col1)]
WJ1
wj1[ windows; cols; sourceTab; (lookupTab;(agg0;col0);(agg1;col1)]

For each row in the sourcetable, a time window pair is specified, matches on cols are then found and those that occur within the time window have the aggregate functions applied to the selected columns.

sourceTable
The table whos items you want to try and find close matches for, the result will have the same number of rows as this table.
lookupTable
The table used for finding matching data to join
cols
A list of columns to use for joining on
the initial columns excluding the last will be matched exactly
the last column will match within the specified windows

wj1

The only difference between wj1 and wj, the difference is that where wj pulls in prevailing values not within the time window, wj1 strictly excludes values outside the interval.