Asof AJ WJ Time Series Joins

kdb+ provides joins designed specifically for time-series data, which are particularly powerful for analysing tick data. Because kdb+ is column-oriented and built on ordered lists, the syntax is usually far more concise, and execution much faster, than in standard SQL databases.

Asof Time Join - for each source row, join the single most recent matching row by date/time.
1. aj - Asof join, typically used to find the last value in one table as of each row in the source table, e.g. the prevailing quote for each trade.
2. aj0 - Same as aj but returns the lookup table's time column.
3. asof - A more limited form of asof join that returns only the lookup values.
Time Window Join
1. wj - For each entry in the source table, find the rows in the lookup table that fall within a given time interval, plus the prevailing value at the start of the interval.
2. wj1 - Same as wj but without the prevailing value; only rows within the interval are included.

Asof Time Join

We will use the following simplified trade (t) and quote (q) tables to demonstrate the various joins.

t:([] 
    time:07:00 08:30 09:59 10:00 12:00 16:00t; 
    sym:`a`a`a`a`b`a; 
    price:0.9 1.5 1.9 2 9. 10.; 
    size:100*6?10);

q:([] 
    time:08:00+`time$60*60000*til 8; 
    sym:`a`b`a`b`b`a`b`a;
    bid:1 9 2 8 8.5 3 7 4.);

aj

AJ: aj[ cols; sourceTable; lookupTable]
AJ0: aj0[ cols; sourceTable; lookupTable]

For each row in the source table, look up a matching row in the lookup table by matching on the columns specified in cols. cols is a list of column names where the initial columns MUST match exactly and the last column matches the latest lookup value LESS THAN OR EQUAL TO the value in the source table.

sourceTable: The table whose rows you want to find close matches for; the result has the same number of rows as this table.
lookupTable: The table searched for matching data to join; the size and schema of this table strongly affect the speed.
cols: A list of column names to join on:
the initial columns, excluding the last, are matched exactly;
the last column matches the latest entry that is less than or equal to the source value.
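
The matching rule on the last column can be checked with a tiny example. This is a minimal sketch with made-up tables (trades and quotes are hypothetical names, not the t and q used on this page): a trade stamped exactly at a quote's time picks up that quote, and a trade before the first quote gets a null.

```q
/ hypothetical tables, for illustration only
quotes:([] time:09:00:00.000 09:05:00.000; sym:`a`a; bid:1 2f)
trades:([] time:08:59:00.000 09:04:59.999 09:05:00.000; sym:`a`a`a; price:10 11 12f)
/ sym is matched exactly; time matches the latest quote at or before each trade
r:aj[`sym`time; trades; quotes]
/ bid is null for the first trade, 1 for the second and 2 for the third
show r
```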

q)t
time         sym price size
---------------------------
07:00:00.000 a   0.9   800
08:30:00.000 a   1.5   100
09:59:00.000 a   1.9   900
10:00:00.000 a   2     500
12:00:00.000 b   9     400
16:00:00.000 a   10    600
q)q
time         sym bid
--------------------
08:00:00.000 a   1
09:00:00.000 b   9
10:00:00.000 a   2
11:00:00.000 b   8
12:00:00.000 b   8.5
13:00:00.000 a   3
14:00:00.000 b   7
15:00:00.000 a   4
q)aj[`sym`time; t; q]
time         sym price size bid
-------------------------------
07:00:00.000 a   0.9   800
08:30:00.000 a   1.5   100  1
09:59:00.000 a   1.9   900  1
10:00:00.000 a   2     500  2
12:00:00.000 b   9     400  8.5
16:00:00.000 a   10    600  4

q)fq:update  qtime:time,qsym:sym from q;
q)ft:update ftime:time, fsym:sym from t
q)aj[`sym`time; ft; fq]
time         sym price size ftime        fsym bid qtime        qsym
-------------------------------------------------------------------
07:00:00.000 a   0.9   800  07:00:00.000 a
08:30:00.000 a   1.5   100  08:30:00.000 a    1   08:00:00.000 a
09:59:00.000 a   1.9   900  09:59:00.000 a    1   08:00:00.000 a
10:00:00.000 a   2     500  10:00:00.000 a    2   10:00:00.000 a
12:00:00.000 b   9     400  12:00:00.000 b    8.5 12:00:00.000 b
16:00:00.000 a   10    600  16:00:00.000 a    4   15:00:00.000 a
q)aj[`time; ft; fq]
time         sym price size ftime        fsym bid qtime        qsym
-------------------------------------------------------------------
07:00:00.000 a   0.9   800  07:00:00.000 a
08:30:00.000 a   1.5   100  08:30:00.000 a    1   08:00:00.000 a
09:59:00.000 b   1.9   900  09:59:00.000 a    9   09:00:00.000 b
10:00:00.000 a   2     500  10:00:00.000 a    2   10:00:00.000 a
12:00:00.000 b   9     400  12:00:00.000 b    8.5 12:00:00.000 b
16:00:00.000 a   10    600  16:00:00.000 a    4   15:00:00.000 a
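
The third row of the last result deserves a second look. With only time in cols, the lookup table's sym is treated as ordinary data and overwrites the source sym wherever a match is found, which is why the 09:59 trade now shows sym b while fsym is still a. If the intent is "latest quote of any symbol" while keeping the trade's own sym, one option is to pass only the wanted columns of the lookup table. A sketch using the same data as above, minus the random size column:

```q
t:([] time:07:00:00.000 08:30:00.000 09:59:00.000 10:00:00.000 12:00:00.000 16:00:00.000; sym:`a`a`a`a`b`a; price:0.9 1.5 1.9 2 9 10f)
q:([] time:08:00:00.000+3600000*til 8; sym:`a`b`a`b`b`a`b`a; bid:1 9 2 8 8.5 3 7 4f)
/ keep only time and bid from the lookup table so that sym is not overwritten
r:aj[`time; t; `time`bid#q]
show r
```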

aj0

aj0 is exactly the same as aj except that the time column of the result holds the lookup table's time rather than the source table's.

q)aj[`sym`time; ft; fq]
time         sym price size ftime        fsym bid qtime        qsym
-------------------------------------------------------------------
07:00:00.000 a   0.9   800  07:00:00.000 a
08:30:00.000 a   1.5   100  08:30:00.000 a    1   08:00:00.000 a
09:59:00.000 a   1.9   900  09:59:00.000 a    1   08:00:00.000 a
10:00:00.000 a   2     500  10:00:00.000 a    2   10:00:00.000 a
12:00:00.000 b   9     400  12:00:00.000 b    8.5 12:00:00.000 b
16:00:00.000 a   10    600  16:00:00.000 a    4   15:00:00.000 a

q)aj0[`sym`time; ft; fq]
time         sym price size ftime        fsym bid qtime        qsym
-------------------------------------------------------------------
07:00:00.000 a   0.9   800  07:00:00.000 a
08:00:00.000 a   1.5   100  08:30:00.000 a    1   08:00:00.000 a
08:00:00.000 a   1.9   900  09:59:00.000 a    1   08:00:00.000 a
10:00:00.000 a   2     500  10:00:00.000 a    2   10:00:00.000 a
12:00:00.000 b   9     400  12:00:00.000 b    8.5 12:00:00.000 b
15:00:00.000 a   10    600  16:00:00.000 a    4   15:00:00.000 a
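
Since aj keeps the source time, copying the lookup time into an extra column (the qtime trick used above) lets both times survive the join. As a sketch, this also makes it possible to measure how stale each matched quote was. The column name age is an arbitrary choice, and the tables are the ones from this section without the random size column.

```q
t:([] time:07:00:00.000 08:30:00.000 09:59:00.000 10:00:00.000 12:00:00.000 16:00:00.000; sym:`a`a`a`a`b`a; price:0.9 1.5 1.9 2 9 10f)
q:([] time:08:00:00.000+3600000*til 8; sym:`a`b`a`b`b`a`b`a; bid:1 9 2 8 8.5 3 7 4f)
/ copy the quote time so it is still available after the join
fq:update qtime:time from q
/ age is the gap between each trade and the quote it matched (null when there was no match)
r:update age:time-qtime from aj[`sym`time; t; fq]
show r
```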

asof

asof is a built-in kdb+ function that provides a limited version of aj. You may find it used occasionally.

q)`sym`time#t
sym time
----------------
a   07:00:00.000
a   08:30:00.000
a   09:59:00.000
a   10:00:00.000
b   12:00:00.000
a   16:00:00.000
q)q asof `sym`time#t
bid
---

1
1
2
8.5
4
q)t,'q asof `sym`time#t
time         sym price size bid
-------------------------------
07:00:00.000 a   0.9   800
08:30:00.000 a   1.5   100  1
09:59:00.000 a   1.9   900  1
10:00:00.000 a   2     500  2
12:00:00.000 b   9     400  8.5
16:00:00.000 a   10    600  4
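
asof also accepts a dictionary on the right, which is handy for a one-off lookup. A minimal sketch, assuming the same q table as above: ask for the last bid for sym a as of 09:30.

```q
q:([] time:08:00:00.000+3600000*til 8; sym:`a`b`a`b`b`a`b`a; bid:1 9 2 8 8.5 3 7 4f)
/ the last key of the dictionary (time) is matched asof, the others exactly
d:q asof `sym`time!(`a;09:30:00.000)
/ d is a dictionary of the remaining lookup columns, here just bid
show d
```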

An alternative way of viewing time-series data, useful for examining the sequence of events across tables, is to use the union join uj to build a combined table and then sort the full table on time.

q)q uj t
time         sym bid price size
-------------------------------
08:00:00.000 a   1
09:00:00.000 b   9
10:00:00.000 a   2
11:00:00.000 b   8
12:00:00.000 b   8.5
13:00:00.000 a   3
14:00:00.000 b   7
15:00:00.000 a   4
07:00:00.000 a       0.9   800
08:30:00.000 a       1.5   100
..
q)`time xasc q uj t
time         sym bid price size
-------------------------------
07:00:00.000 a       0.9   800
08:00:00.000 a   1
08:30:00.000 a       1.5   100
09:00:00.000 b   9
09:59:00.000 a       1.9   900
10:00:00.000 a   2
10:00:00.000 a       2     500
11:00:00.000 b   8
12:00:00.000 b   8.5
12:00:00.000 b       9     400
..
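
The combined table also shows what aj is doing: forward-filling bid within each sym and keeping only the trade rows gives the same prevailing bids. This is only an illustrative sketch, not a replacement for aj; it relies on the sort being stable (so a quote stays ahead of a trade with the same timestamp) and it agrees with aj only when the quote column itself contains no nulls.

```q
t:([] time:07:00:00.000 08:30:00.000 09:59:00.000 10:00:00.000 12:00:00.000 16:00:00.000; sym:`a`a`a`a`b`a; price:0.9 1.5 1.9 2 9 10f)
q:([] time:08:00:00.000+3600000*til 8; sym:`a`b`a`b`b`a`b`a; bid:1 9 2 8 8.5 3 7 4f)
/ stack the tables, sort by time, then carry the last bid forward within each sym
u:update fills bid by sym from `time xasc q uj t
/ keep only the rows that came from the trade table
r:select time,sym,price,bid from u where not null price
show r
```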

Running AJ on large tables

q)\l trades.q
(+`date`sym!(2013.09.27 2013.09.28 2013.09.29 2013.09.30 2013.10.01;`RBS`RBS`RBS`RBS`..
q)trade:100?trade
q)count each (trade;quote)
100 1700000
q)meta quote
c    | t f a
-----| -----
date | d   s
time | t
sym  | s
size | i
cond | c
bid  | f
ask  | f
asize| j
bsize| j
q)\t r1:aj[`sym`time; trade; quote]
681
q)\t update `g#sym from `quote
46
q)meta quote
c    | t f a
-----| -----
date | d   s
time | t
sym  | s   g
size | i
cond | c
bid  | f
ask  | f
asize| j
bsize| j
q)\t r2:aj[`sym`time; trade; quote]
0
q)r1~r2
1b
q)

Running time-series joins such as aj on large amounts of data can take a significant amount of time. In the session above, applying the grouped attribute (`g#) to the sym column of quote took 46 ms once and reduced the aj itself from 681 ms to under a millisecond, with identical results. Take care when running aj or wj, particularly against on-disk data. If you run into problems, consult the documentation on code.kx.com or an experienced kdb+ programmer.
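
Since trades.q is not reproduced here, the same experiment can be sketched on synthetic data. Everything below is made up (random times, four symbols, 200,000 quotes), so the timings will differ from those above; the point is only that the attribute changes the speed and not the result. Check the aj reference on code.kx.com for the sort order and attributes recommended for your version and for on-disk tables.

```q
n:200000
/ synthetic lookup table, sorted by time
quote:([] time:asc `time$n?86400000; sym:n?`a`b`c`d; bid:n?100f)
/ synthetic source table of 100 trades
trade:([] time:asc `time$100?86400000; sym:100?`a`b`c`d; price:100?100f)
r1:aj[`sym`time; trade; quote]
/ apply the grouped attribute to sym in place, as in the session above
update `g#sym from `quote;
r2:aj[`sym`time; trade; quote]
/ the joined result is unchanged
show r1~r2
```

Wrapping either aj call in \t at the console shows the effect on timing.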

Time Window Join

We will use the following simplified trade (t) and quote (q) tables to demonstrate the various time window joins.

t:([] 
    time:  09:00 09:04 09:12 09:13t; 
    sym:   `a`a`a`a; 
    price: 10 11 12 13.);

q:([] 
    time: 09:00+`time$60000*til 13; 
    sym: `a`a`a`a`a`b`b`b`a`a`a`a`a;
    bid: asc 9.+13?10);

wj

WJ: wj[ windows; cols; sourceTable; (lookupTable;(agg0;col0);(agg1;col1))]
WJ1: wj1[ windows; cols; sourceTable; (lookupTable;(agg0;col0);(agg1;col1))]

For each row in the source table a time window (a start time and an end time) is specified. Rows in the lookup table that match on cols and fall within that window are collected, and the aggregate functions are applied to the selected columns.

windows: A pair of lists (start times; end times) with one entry per row of the source table.
sourceTable: The table whose rows you want to find matches for; the result has the same number of rows as this table.
lookupTable: The table searched for matching data to join.
cols: A list of column names to join on:
the initial columns, excluding the last, are matched exactly;
the last column matches within the specified windows.

q)t
time         sym price
----------------------
09:00:00.000 a   10
09:04:00.000 a   11
09:12:00.000 a   12
09:13:00.000 a   13
q)q
time         sym bid
--------------------
09:00:00.000 a   10
09:01:00.000 a   10
09:02:00.000 a   11
09:03:00.000 a   13
09:04:00.000 a   13
09:05:00.000 b   14
09:06:00.000 b   14
09:07:00.000 b   15
09:08:00.000 a   15
09:09:00.000 a   17
09:10:00.000 a   17
09:11:00.000 a   18
09:12:00.000 a   18

q)windows:flip t.time +\: -00:02 00:02t
q)windows
08:58:00.000 09:02:00.000 09:10:00.000 09:11:00.000
09:02:00.000 09:06:00.000 09:14:00.000 09:15:00.000
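
The windows need not be symmetric. As a sketch, a small helper can build the pair of lists directly; win is a hypothetical name introduced here, not a built-in, and the trade table is the one defined above.

```q
t:([] time:09:00:00.000 09:04:00.000 09:12:00.000 09:13:00.000; sym:`a`a`a`a; price:10 11 12 13f)
/ win (hypothetical helper): start and end lists for a list of times ts
win:{[ts;before;after] (ts-before; ts+after)}
/ two minutes either side of each trade, the same windows as above
w1:win[t.time; 00:02:00.000; 00:02:00.000]
/ look-back only: the five minutes up to and including each trade
w2:win[t.time; 00:05:00.000; 00:00:00.000]
show w1
show w2
```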

q)wj[windows; `sym`time; t; (q; (::; `bid))]
time         sym price bid
--------------------------------
09:00:00.000 a   10    10 10 11f
09:04:00.000 a   11    11 13 13f
09:12:00.000 a   12    17 18 18f
09:13:00.000 a   13    18 18f

q)wj[windows; `sym`time; t; (q; (::; `bid); (avg;`bid))]
time         sym price bid       bid
-----------------------------------------
09:00:00.000 a   10    10 10 11f 10.33333
09:04:00.000 a   11    11 13 13f 12.33333
09:12:00.000 a   12    17 18 18f 17.66667
09:13:00.000 a   13    18 18f    18
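
In the last result both aggregate columns are named bid, because each output column takes the name of the column it aggregates. One way around this, sketched here on the assumption that copying the column is acceptable, is to give the lookup table a differently named copy for each aggregate (hi and lo are arbitrary names). The tables are small fixed versions of t and q so that the output is reproducible.

```q
t:([] time:09:00:00.000 09:04:00.000; sym:`a`a; price:10 11f)
q:([] time:09:00:00.000+60000*til 6; sym:6#`a; bid:10 10 11 13 13 14f)
/ two minutes either side of each trade
w:(t.time-00:02:00.000; t.time+00:02:00.000)
/ copy bid under two names so each aggregate gets its own output column
q2:update hi:bid, lo:bid from q
r:wj[w; `sym`time; t; (q2; (max;`hi); (min;`lo))]
/ hi is 11 14 and lo is 10 11
show r
```

The wj reference on code.kx.com asks for the lookup table to be sorted by sym and then time, with the parted attribute on sym; the single-symbol table here is already in that order.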

To check these results by eye, interleave the two tables in time order:

q)`time xasc t uj q

wj1

The only difference between wj1 and wj is that wj also pulls in the prevailing value from before the time window, whereas wj1 strictly excludes values outside the interval.

q)win2:(08:58:00.000 09:02:00.000 09:10:00.000 10:10:00.00; 09:02:00.000 09:06:00.000 09:14:00.000 10:15:00.0);
q)
q)win2
08:58:00.000 09:02:00.000 09:10:00.000 10:10:00.000
09:02:00.000 09:06:00.000 09:14:00.000 10:15:00.000
q)windows
08:58:00.000 09:02:00.000 09:10:00.000 09:11:00.000
09:02:00.000 09:06:00.000 09:14:00.000 09:15:00.000


q)wj[win2; `sym`time; t; (q; (::; `bid))]
time         sym price bid
--------------------------------
09:00:00.000 a   10    10 10 11f
09:04:00.000 a   11    11 13 13f
09:12:00.000 a   12    17 18 18f
09:13:00.000 a   13    ,18f

q)wj1[win2; `sym`time; t; (q; (::; `bid))]
time         sym price bid
--------------------------------
09:00:00.000 a   10    10 10 11f
09:04:00.000 a   11    11 13 13f
09:12:00.000 a   12    17 18 18f
09:13:00.000 a   13    `float$()
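
The same difference shows up clearly with a counting aggregate. A minimal sketch with hypothetical one-trade tables t1 and q1, using a window that starts an hour after the last quote:

```q
t1:([] time:enlist 09:13:00.000; sym:enlist `a; price:enlist 13f)
q1:([] time:09:10:00.000 09:11:00.000 09:12:00.000; sym:3#`a; bid:17 18 18f)
/ no quote falls inside this window
w:(enlist 10:10:00.000; enlist 10:15:00.000)
r0:wj[w; `sym`time; t1; (q1; (count;`bid))]
r1:wj1[w; `sym`time; t1; (q1; (count;`bid))]
/ wj counts the prevailing quote (1); wj1 counts nothing (0)
show r0
show r1
```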
