Archive for September, 2014

Command Line Kdb+ Charts

September 8th, 2014 by admin

sqlDashboards is bundled with qStudio. Part of that package is sqlChart, a command-line utility that generates customized SQL charts.

Check out the video to see how you can create a chart based on data from a kdb+ database in 2 minutes:

The sqlChart page has all the documentation you need. Download qstudio.zip to try it now.

The q Code

C:\temp\ch\qstudio>sqlchart -s kdb -P 5000 -e "([] dt:2013.01.01+til 21; cosineWave:cos a; sineWave:sin a:0.6*til 21)" -c timeseries -W 600 -t dark
C:\temp\ch\qstudio\out.png

Help Screen

C:\temp\ch\qstudio>sqlchart
Option (* = required)             Description
---------------------             -----------
-?, --help                        Display a help message and exit.
-D, --database <db_name>          The database to use.
-H, --height <output_height>      Set the height of the chart output
                                    (default: 300)
-P, --port <port_num>             The TCP/IP port number to use for the
                                    SQL Server connection.
-W, --width <output_width>        Set the width of the chart output
                                    (default: 400)
-c, --chart <chart_type>          Set the selected chart type. Options
                                    available: timeseries, areachart,
                                    barchart, bubblechart, candlestick,
                                    datatable, heatmap, histogram,
                                    linechart, noredraw, piechart,
                                    scatterplot (default: barchart)
-e, --execute <sql_statement>     Execute the selected sql statement.
-h, --host <host_name>            SQL server host that will be queried.
                                    (default: localhost)
-o, --out <file_name>             The name of the destination image
                                    file. (default: out.png)
-p, --password <password>         Password used to connect to SQL server.
* -s, --servertype <server_type>  The type of sql server being queried.
                                    Valid values include:kdb,mysql,
                                    postgres,mssql,h2.
-t, --theme <color_theme>         Set the color theme for the chart.
                                    Options available: light,dark,pastel
                                    (default: light)
-u, --user <user_name>            Username used to connect to SQL server.
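
If you want to call sqlchart from a script rather than typing the command by hand, something like the following Python sketch could work. It uses only the flags listed in the help screen above. The function names (build_sqlchart_args, render_chart) are made up for illustration, and it assumes the sqlchart launcher is on your PATH under exactly that name, which may not hold on every platform.

```python
import subprocess

def build_sqlchart_args(query, server_type="kdb", port=5000, chart="timeseries",
                        width=600, height=300, theme="dark", out="out.png"):
    """Assemble an argument list using only flags shown in the help screen.

    The function name and its defaults are illustrative, not part of sqlChart.
    """
    return [
        "sqlchart",          # assumption: the launcher is on PATH under this name
        "-s", server_type,   # required: kdb, mysql, postgres, mssql or h2
        "-P", str(port),
        "-e", query,
        "-c", chart,
        "-W", str(width),
        "-H", str(height),
        "-t", theme,
        "-o", out,
    ]

def render_chart(query, **options):
    """Run sqlchart and raise if it exits with a non-zero status."""
    args = build_sqlchart_args(query, **options)
    return subprocess.run(args, check=True, capture_output=True, text=True)
```

Passing the arguments as a list avoids shell quoting problems with q expressions that contain spaces, semicolons and brackets.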

Tags: kdb chart, kdb+, qstudio, timeseries. kdb+, qStudio

Pipe-lining Time Series Calculations for Cache Efficiency

September 8th, 2014 by admin

I always like to investigate new technology, and this week I found a nice automatic technique for improving cache use, something I had previously seen people implement by hand.

Consider a database query with three steps (three SQL SELECTs). Some databases may pass the results of each step to temporary tables in main memory: when the first step is finished, its intermediate results are pulled back into CPU cache to be transformed by the second step, then written to a new temporary table in main memory, and so on.

To eliminate this back-and-forth, vector-based statistical functions can be pipelined, with the output of one function becoming input for the next, whose output feeds a third function, etc. Intermediate results stay in the pipeline inside CPU cache, with only the full result being materialized at the end.
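
To make the difference concrete, here is a rough Python sketch of the two data flows. It is not how ExtremeDB is implemented, and Python gives no control over CPU cache; it only shows the shape of the idea. The stage names (scale, offset, total) and the tile size are invented for the example.

```python
from itertools import islice

TILE_SIZE = 4  # hypothetical tile length; a real engine would size tiles to fit cache

def tiles(values, size=TILE_SIZE):
    """Yield successive fixed-size chunks ("tiles") of the input sequence."""
    it = iter(values)
    while True:
        tile = list(islice(it, size))
        if not tile:
            return
        yield tile

def scale(tile_stream, factor):
    """Stage 1: multiply every element by a constant, one tile at a time."""
    for tile in tile_stream:
        yield [x * factor for x in tile]

def offset(tile_stream, delta):
    """Stage 2: add a constant to every element, one tile at a time."""
    for tile in tile_stream:
        yield [x + delta for x in tile]

def total(tile_stream):
    """Final stage: reduce the stream to a single materialized result."""
    return sum(sum(tile) for tile in tile_stream)

def pipelined_total(values, factor, delta):
    # Each tile passes through both stages before the next tile is read,
    # so no full-length intermediate list is ever built.
    return total(offset(scale(tiles(values), factor), delta))

def materialized_total(values, factor, delta):
    # The contrasting approach: every step builds a full temporary list,
    # playing the role of the temporary tables described above.
    step1 = [x * factor for x in values]
    step2 = [x + delta for x in step1]
    return sum(step2)
```

Both functions return the same answer; the pipelined version just never holds more than one tile per stage at a time.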

This technology is part of ExtremeDB. They have a video that explains it well:

Time Series Calculations

Moving Averages Stock Price Example

This is what the actual code would look like to calculate the 5-day and 21-day moving averages for a stock and detect the points where the faster moving average (5-day) crosses over or under the slower moving average (21-day):

select seq_map(ClosePrice,
    seq_cross(seq_sub(seq_window_agg_avg(ClosePrice, 5), 
    seq_window_agg_avg(ClosePrice, 21)), 1)) 
from Security;

Two invocations of ‘seq_window_agg_avg’ execute over the closing price sequence, ‘ClosePrice’, to obtain 5-day and 21-day moving averages.
The function ‘seq_sub’ subtracts the 21-day moving average from the 5-day moving average.
The result “feeds” a fourth function call, ‘seq_cross’, which identifies where the 5-day and 21-day moving averages cross.
Finally, the function ‘seq_map’ maps the crossovers to the original ‘ClosePrice’ sequence, returning closing prices where the moving averages crossed.
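
For readers without ExtremeDB to hand, the same four steps can be sketched in plain Python. This is only an approximation of the logic: the helper names are mine, positions without a full window are set to None (I have not checked how seq_window_agg_avg treats them), and a cross in either direction is reported, which may differ from what the second argument of seq_cross selects.

```python
def moving_average(values, window):
    """Trailing moving average; positions without a full window yield None."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        out.append(running / window if i >= window - 1 else None)
    return out

def subtract(a, b):
    """Element-wise a - b, propagating None where either side is undefined."""
    return [None if x is None or y is None else x - y for x, y in zip(a, b)]

def cross_positions(diff):
    """Indices where diff changes sign relative to the previous defined value."""
    positions = []
    prev = None
    for i, d in enumerate(diff):
        if d is None:
            continue
        if prev is not None and (prev < 0 <= d or prev >= 0 > d):
            positions.append(i)
        prev = d
    return positions

def crossover_prices(close, fast=5, slow=21):
    """Return (index, closing price) pairs where the fast and slow averages cross."""
    diff = subtract(moving_average(close, fast), moving_average(close, slow))
    return [(i, close[i]) for i in cross_positions(diff)]
```

Note that this version builds a full list at every step, which is exactly the materialization the pipelined approach is meant to avoid.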

This approach eliminates the need to create, populate and query temporary tables in main memory, outside CPU cache. Each invocation of ‘mco_seq_window_agg_avg_double()’ processes one “tile” of input data and produces one tile of output, which is passed to ‘mco_seq_sub_double()’. That function, in turn, produces one tile of output that is passed as input to ‘mco_seq_cross_double()’, which produces one tile of output that is passed to ‘mco_seq_map_double()’. When the last function, ‘mco_seq_map_double()’, has exhausted its input, the whole process repeats from the top to produce new tiles for additional processing.

A very cool idea!

And yes, ExtremeDB are the same guys that posted the top STAC-M3 benchmark for a while (in 2012/13, I think).

Tags: column database, tick data, time series, timeseries database. timeseries

TimeStored Blog