Archive for July, 2016

Open Source Alternative to kdb?

I often get asked what open source alternatives there are to kdb+. The answer depends on what you are trying to do. If there were a product XYZ that provided some similar features, whether it could replace kdb would depend on a few issues:

>>”What will XYZ bring us that kdb doesn’t?”
Kdb has been tried and tested over many computer/man-years. The KX team have fixed thousands of edge cases, optimization issues and OS-specific bugs. Any similar system would have to replicate a lot of that work. That is possible, but it would take time and teams actually using it. It would also require a corporate entity to provide support and bug fixes together with long-term guarantees of availability (not a few part-time committers on GitHub). On top of that, it would need to deliver more value to make it worth switching.

Kdb is both a database and a programming language, and it’s that combination which I believe gives kdb its unique power:
– There is no open source database that provides the speed kdb provides for the particular queries suited to finance.
– Combining the two and basing queries on q-sql/ordered lists (rather than the set theory behind standard SQL) means queries require fewer lines of code. I believe this expressiveness, combined with longer-term use of kdb/q, changes how you think and makes it easy to form queries which many people couldn’t begin to write in standard SQL.
– However, as much as I think q is a selling point of kdb, I know many others would disagree. It takes a reasonable period of time to convince someone that non-standard SQL is beneficial.

What is your use case? Example queries to consider:

1. Select top N by category
http://stackoverflow.com/questions/176964/select-top-10-records-for-each-category
select n#price by sym from trade
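
To make the comparison concrete, here is a minimal Python sketch of what that one-line q query does: keep the first n prices for each sym, in table order. The trade rows and the function name first_n_by_sym are made up for illustration. Unlike q's take, this sketch simply returns a shorter list when a sym has fewer than n rows.

```python
from collections import defaultdict


def first_n_by_sym(trades, n):
    """Return the first n prices per sym, preserving the table's row order."""
    grouped = defaultdict(list)
    for row in trades:
        prices = grouped[row["sym"]]
        if len(prices) < n:  # stop collecting once a sym already has n prices
            prices.append(row["price"])
    return dict(grouped)


# Hypothetical trade table, one dict per row.
trade = [
    {"sym": "IBM", "price": 100.0},
    {"sym": "MSFT", "price": 50.0},
    {"sym": "IBM", "price": 101.0},
    {"sym": "IBM", "price": 102.0},
    {"sym": "MSFT", "price": 50.5},
    {"sym": "GOOG", "price": 700.0},
]

top2 = first_n_by_sym(trade, 2)
```

Even in this small form the grouping and the per-group limit have to be spelled out by hand, which is the point of the comparison with the q version.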

2. Joining Records on nearest date time:
http://www.bigresource.com/MS_SQL-joining-records-by-nearest-datetime-XsKMeH3t.html
aj[`sym`time;select .. from trade where ..;select .. from quote]
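
For readers who have not met aj, the idea is that each trade picks up the most recent quote for the same sym at or before the trade's time. Below is a rough Python sketch of that behaviour, not of how kdb implements it. The tables, the bid/ask columns and the function name asof_join are assumptions for illustration, and quotes are assumed to be sorted by time within each sym.

```python
from bisect import bisect_right
from collections import defaultdict


def asof_join(trades, quotes):
    """For each trade, attach the latest quote with the same sym and time <= trade time."""
    # Index quotes by sym; assumes quotes arrive sorted by time within each sym.
    times = defaultdict(list)
    rows = defaultdict(list)
    for q in quotes:
        times[q["sym"]].append(q["time"])
        rows[q["sym"]].append(q)

    joined = []
    for t in trades:
        # Number of quotes for this sym with time <= the trade's time.
        i = bisect_right(times[t["sym"]], t["time"])
        match = rows[t["sym"]][i - 1] if i > 0 else {}
        # No prevailing quote leaves bid/ask as None.
        joined.append({**t, "bid": match.get("bid"), "ask": match.get("ask")})
    return joined


# Hypothetical quote and trade tables; time is a plain integer here.
quote = [
    {"sym": "IBM", "time": 1, "bid": 99.0, "ask": 101.0},
    {"sym": "MSFT", "time": 2, "bid": 49.0, "ask": 51.0},
    {"sym": "IBM", "time": 5, "bid": 100.0, "ask": 102.0},
]
trade = [
    {"sym": "MSFT", "time": 1, "price": 50.0},
    {"sym": "IBM", "time": 3, "price": 100.0},
    {"sym": "IBM", "time": 5, "price": 101.0},
]

joined = asof_join(trade, quote)
```

The binary search per trade keeps the lookup cheap, but the indexing, the edge case of no earlier quote and the column merge all need explicit code here.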

3. Queries dependent on order (e.g. price change, subtracting the previous row from the current one)
http://stackoverflow.com/questions/919136/subtracting-one-row-of-data-from-another-in-sql
select price-prev price from trade....
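
The order-dependent case is the easiest to show outside q. The sketch below mirrors price-prev price on a plain Python list: the first row has no previous value, so its change is left as None (standing in for a q null). The sample prices and the function name price_changes are invented for illustration.

```python
def price_changes(prices):
    """Return each price minus the previous one; the first entry has no predecessor."""
    previous = [None] + prices[:-1]  # shift the column down by one row
    return [None if p is None else cur - p for cur, p in zip(prices, previous)]


# Hypothetical price column, in trade order.
prices = [100.0, 101.5, 101.0, 103.0]
changes = price_changes(prices)
```

This works because a Python list, like a kdb column, has a defined order. The difficulty in standard SQL comes from tables being unordered sets.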

XYZ would need to support these queries well.

Why would I choose XYZ instead of Python/R/J/A+?
These are existing languages (some of them similar) that offer a larger user base, more libraries and a proven/stable platform. Unless a way is found to leverage existing languages/libraries, XYZ will be competing for attention against kdb and also Python/NumPy/Julia etc.

>>”bring in the cost factor and should XYZ be considered as a big future player?”
For the target market of kdb the cost is often not the most significant factor in the decision. If kdb can answer questions that other platforms can’t or in a much shorter time, it often adds enough value to make the cost irrelevant. In fact many large firms are happy paying a pricey support agreement for free open source software so that they have someone to (blame) call to resolve an issue quickly.

>>”but could XYZ catch up and begin to be trusted by bigger institutions?”
If XYZ could answer the three example queries shown above within a reasonable multiple of kdb’s speed, then perhaps, but I consider it unlikely. Kdb is entrenched and for its target use case it is currently unbeatable. Some people may have use cases that don’t need the full power of a database and language combined, or have other important factors (cost, existing expertise). I think those use cases have viable open source solutions.