Archive for the 'kdb+' Category

Reconcile tickerplant feeds using longest common subsequence matching

Sanket Agrawal just posted some very cool code to the k4 mailing list for finding longest common subsequences. It solves a problem some colleagues had: they wanted to combine two tickerplant feeds that never seemed to match up. Here’s an example:


q)/ assuming t is our perfect knowledge table

q)t
time sym size price
------------------------
09:00 A 8 7.343062
09:01 A 0 5.482385
09:02 B 5 8.847715
09:03 A 1 6.78881
09:04 B 5 3.432312
09:05 A 0 0.2801381
09:06 A 2 3.775222
09:07 B 3 1.676582
09:08 B 7 7.163578
09:09 B 4 3.300548

Let us now create two tables, u and v, neither of which contains all the data.

q)u:t except t 7 8
q)v:t except t 1 2 3 4
q)u
time sym size price
------------------------
09:00 A 8 7.343062
09:01 A 0 5.482385
09:02 B 5 8.847715
09:03 A 1 6.78881
09:04 B 5 3.432312
09:05 A 0 0.2801381
09:06 A 2 3.775222
09:09 B 4 3.300548
q)v
time sym size price
------------------------
09:00 A 8 7.343062
09:05 A 0 0.2801381
09:06 A 2 3.775222
09:07 B 3 1.676582
09:08 B 7 7.163578
09:09 B 4 3.300548

We can find the indices that differ using Sanket’s diffTables function:

q)p:diffTables[u;v;t `sym;`price`size]
q)show p 0; / rows in u that are not in v
1 2 3 4
q)show p 1; / rows in v that are not in u
3 4

/ combine together again

q)`time xasc (update src:`u from u),update src:`v from v p 1
time sym size price src
----------------------------
09:00 A 8 7.343062 u
09:01 A 0 5.482385 u
09:02 B 5 8.847715 u
09:03 A 1 6.78881 u
09:04 B 5 3.432312 u
09:05 A 0 0.2801381 u
09:06 A 2 3.775222 u
09:07 B 3 1.676582 v
09:08 B 7 7.163578 v
09:09 B 4 3.300548 u
q)t~`time xasc u,v p 1
1b

The code can be downloaded at:
http://code.kx.com/wsvn/code/contrib/sagrawal/lcs/miller.q

http://code.kx.com/wsvn/code/contrib/sagrawal/lcs/myers.q

Thanks Sanket.

Algorithm details:

Myers O(ND): http://www.xmailserver.org/diff2.pdf

Miller O(NP):  http://www.bookoff.co.jp/files/ir_pr/6c/np_diff.pdf
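
For intuition only, here is a minimal sketch of the idea behind the matching, using the textbook quadratic dynamic-programming LCS rather than the faster Myers or Miller algorithms linked above. The function name lcsIdx and the sample table are assumptions made for this illustration; this is not Sanket's diffTables and it has not been compared against it.

```q
/ sample data shaped like the tables in the post (prices omitted)
t:([] time:09:00+til 10; sym:`A`A`B`A`B`A`A`B`B`B; size:8 0 5 1 5 0 2 3 7 4)
u:t except t 7 8
v:t except t 1 2 3 4

/ lcsIdx: hypothetical helper, plain O(N*M) dynamic programming
/ returns (indices of x in the LCS; matching indices of y)
lcsIdx:{[x;y]
  n:count x; m:count y;
  / L[i;j] is the LCS length of the first i items of x and the first j items of y
  L:(1+n;1+m)#0;
  i:0;
  while[i<n;
    j:0;
    while[j<m;
      L[i+1;j+1]:$[x[i]~y[j]; 1+L[i;j]; L[i;j+1]|L[i+1;j]];
      j+:1];
    i+:1];
  / walk back from the bottom-right corner collecting matched positions
  a:`long$(); b:`long$(); i:n; j:m;
  while[(i>0)&j>0;
    $[x[i-1]~y[j-1]; [a,:i-1; b,:j-1; i-:1; j-:1];
      L[i-1;j]>=L[i;j-1]; i-:1;
      j-:1]];
  (reverse a;reverse b)}

p:lcsIdx[u;v]
onlyU:(til count u) except p 0   / rows in u that are not in v
onlyV:(til count v) except p 1   / rows in v that are not in u
merged:`time xasc u,v onlyV      / same merge step as in the post
```

On this sample data onlyU and onlyV should come out as 1 2 3 4 and 3 4, the same indices the post shows for diffTables. The quadratic table is fine for a handful of rows but would be far too slow for real feeds, which is where the O(ND) and O(NP) algorithms earn their keep.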

q for mortals – Version 3

The excellent Q For Mortals: A Tutorial In Q Programming by Jeffry Borror will soon be updated to version 3.
Jeff mentioned it at the recent NY user meeting.

You can read q for mortals online for free at: http://code.kx.com/wiki/JB:QforMortals2/contents
Version 2 added a new chapter on kdb database disk storage.

He said to get in touch if there’s anything you feel you would really like added.

UPDATE December 2015

q for mortals 3 is now out.

Q for Mortals Version 3 is a thorough presentation of the q programming language and an introduction to the kdb+ database. It is a complete rewrite of the original Q for Mortals that is current with q3.3. The presentation is derived from classes taught by the author at international financial institutions over the last decade. It is a series of tutorials based on q snippets intended to be entered interactively into the q console by the reader. The text takes its subject seriously but not itself. Technical explanations are augmented by mathematical observations, references to general programming concepts and other programming languages, and bad jokes. Coding style recommendations and advice to avoid gotchas appear liberally throughout. Examples are as simple as they can be but no simpler.

  • Chapter 1, Q Shock and Awe, provides a piquant panorama of the power of q and its dazzling zen-like nature.
  • Chapter 2 describes the base data types of q.
  • Chapter 3 discusses lists, the fundamental data structure of q.
  • Chapter 4 presents the basic operators.
  • Chapter 5 introduces dictionaries, which associate keys and values.
  • Chapter 6 presents an in-depth description of functions and q’s constructs for functional programming.
  • Chapter 7 demonstrates transforming data from one type to another.
  • Chapter 8 introduces tables and keyed tables, the fundamental data structures for kdb.
  • Chapter 9 describes q-sql and all the methods to manipulate tables.
  • Chapter 10 presents ways to control execution of q programs.
  • Chapter 11 covers file and interprocess communication I/O.
  • Chapter 12 describes workspace organization and management.
  • Chapter 13 discusses system commands and command line parameters.
  • Chapter 14 serves as an introduction to the kdb+ database.

Available at all good bookstores.
http://www.amazon.com/For-Mortals-Version-Introduction-Programming/dp/0692573674
qTips is also good: http://www.amazon.co.uk/Tips-Fast-Scalable-Maintainable-Kdb/dp/9881389909

want to learn kdb? here are some new tutorials

Added four new tutorials for those starting to learn kdb:

qStudio for Kdb 1.26 Released

q Code File Browser and Adding Multiple Kdb Servers

qstudio kdb file tree

Added IDE Features:

  • Added a file tree that allows browsing a directory and provides autocomplete
  • qDoc supports custom user tags (Thanks Aaron)
  • Allow adding/exporting whole lists of servers at once (much quicker)
  • Installers are now signed.
  • Ctrl-D “go to definition” on a function opens that file at the right position
  • (PRO) Unit Testing and function profiling partially integrated.

Download qStudio 1.26 for Kdb Database

qStudio Kdb IDE 1.25 Released

Added qStudio Features:

qstudio kdb ide with custom font

  • Faster chart drawing (~1.6x faster)
  • Added No Redraw chart option for those who want extra speed
  • Numerous bugfixes to charts that froze
  • Allow setting code editor font size
  • Fix display of boolean/byte lists

kdb+ New York User Meeting – March 11th 2013

Speakers Included:

Dennis Shasha – Fun with Timeseries

An interesting, fun talk on finding the most highly correlated streams among thousands of streams extremely efficiently, by comparing streams against randomly generated data rather than exhaustively against each other. This was followed by efficient pattern detection over different time windows, which I think is described in a paper.

Aaron Davies – (dis-)functional select and de-queueing bugs

Slides available here.
For those who have ever been forced to write functional selects, the shortcut notation allows a much more readable form.
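
As a reminder of why functional selects are painful to read, here is a small sketch comparing an ordinary q-sql query with its general functional form ?[table;where;by;aggregates]. The table and column names are made up for illustration, and the shortcut notation from Aaron's talk is not reproduced here.

```q
/ illustrative table, not taken from the talk
t:([] sym:`A`A`B; size:8 0 5; price:7.3 5.4 8.8)

/ q-sql form
r1:select total:sum size by sym from t where price>6

/ the same query as a functional select:
/ a list of where-constraint parse trees, a by dictionary, an aggregate dictionary
r2:?[t; enlist(>;`price;6); (enlist`sym)!enlist`sym; (enlist`total)!enlist(sum;`size)]

same:r1~r2
```

Running parse on the q-sql text as a string is a handy way to see the functional form of any query before writing it by hand.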

Joe Landman – Performance vs. IO & Memory etc.

This is a topic I find extremely interesting. Unfortunately the material was probably too densely packed and I couldn’t follow along well. Hopefully the slides will be released so I can get the details at my own pace. Joe has a great blog, http://scalability.org, full of high performance articles.

Simon Garland – Kdb 3.0 3.1f

Multithreaded slaves as standard – start kdb with an overridden -s and you can run separate kdb processes on different parts that will each perform part of the query. Standard slaves with par.txt were proving less worthwhile, especially for SSDs. Curious to see how this will compare to mserve.
-23! – optimise reading entire tables by mapping them in all at once. Really I want this to happen automatically (query optimizer please) but I guess we have to be happy with the faster speed.

Discussion panel: “Components, Frameworks and Nifty Internal Tools – The Good, Bad and Ugly?”

Dave Thomas, Bedarra
Ed Bierly
Igor Kantor, BAML
Nate McNamara, Morgan Stanley
Ryan Hamilton, Timestored
At times everyone seemed to agree that their own frameworks were good but that others’ were questionable. What interested me was the questioning from the audience about creating an open source library of kdb tools…

UK Kdb Developers have longer attention spans

On TimeStored we monitor visitor statistics, so assuming that most visitors work with kdb, this gives us insight into where kdb consultants/developers are located, what OS/browser they use and how long they spend on the site. For example:

kdb consultant locations

kdb locations

If you work in kdb you’re probably underneath a dot on this map.

Kdb Developers by Country

Kdb Developers by City

We can see that if you work in kdb you’re probably located in either the UK or the US, mostly in London or New York. It’s interesting to see Belfast and Newry show up, but this is easily explained: First Derivatives’ HQ is in Newry, and both Citi and First Derivatives have offices in Belfast with kdb staff.

Now that we have seen where kdb developers are located, let’s consider what OS/browser they use:

Kdb Developer Attention Span

kdb developer web browsers

kdb user operating systems

UK users spent 1 minute longer on the site than their US counterparts 🙂 Though this could partially be explained by the site being more responsive in the UK as the hosting was based there.

Surprisingly (to me) Chrome was the number one browser, and someone was using FreeBSD!