Archive for the 'kdb+' Category

kdb 2019 Roadmap – Cloud and Community

This article was drafted in 2019. Given that it is now 2021, it makes for an interesting look back and, sadly, still a look forward…

There have been two big changes in the software world:

  • The Cloud
  • Community collaboration

Winners and Losers

From these shifts, there have been winners and losers:

  • Community helped Wikipedia build the best encyclopaedia
    relegating Britannica and Encarta to history.
  • Community helped Linux become the dominant operating system;
    Solaris and OS/2 systems are now used only in legacy niches
  • Community-developed Python is replacing MATLAB
  • Cloud has seen Atlassian/GitHub/Amazon/Salesforce etc. win by offering SaaS solutions
    to replace what would previously have been locally installed software (SAP/Perforce)
  • Cloud-hosted Gmail/Hotmail have replaced companies running their own mail servers

If kdb doesn’t change, it will become a legacy platform, with developers maintaining legacy systems that over time will be replaced with modern cloud alternatives.

Therefore we are starting two initiatives:

Cloud native KDB

  • A fully-managed time-series database hosted on Google Cloud
  • Able to be signed up for and used within 10 minutes
  • Clear predictable pricing based on storage and query usage
  • Hiding all the complexity of kdb (no par.txt/segments/sym file manipulation)
  • While providing access to the speed and expressiveness of the language
  • Taking advantage of modern load balancing (Kubernetes) and cheap storage (S3)

We have a skunkworks team, based in their own office, tasked with making a kdb cloud database solution so reliable and feature-rich that your kdb experts can stop working to keep the database running and instead focus on business problems.

Community Driven q

We want kdb to run everywhere, for the barriers to adoption to drop and for the language to expand what it can do. A new kdb user will be able to run kdb on their machine through their standard package manager and to access a whole library of utilities to help them with whatever task they are trying to achieve.

KDB everywhere

To do this, we’ve formed a committee including representatives from finance/education and the wider community to:

  • Open source the q language
    • Development possibilities will be opened up to the wider community as anyone can submit ideas or even PRs for experimental functionality
    • Being open source allows kdb to be bundled with Linux, and we see this as allowing wider use of q scripts
  • Create a hosted packaging system, similar to npm/Maven, that allows code to be reused easily
    • Providing a wider library of community-maintained packages that are easily reusable
    • Work with AquaQ to migrate parts of their TorQ framework to provide a kdb standard library
    • Work with the community to onboard some of their code as packages
      e.g. TimeStored is donating qunit
  • Provide a recommended SDLC for kdb. Over the years we’ve developed processes for end to end development of q code at scale and we will be providing that same tooling to everyone.

By both open sourcing the language and allowing easier development of shareable packages, we accelerate the pace at which kdb can help all developers solve problems and share solutions, making the kdb platform stronger for everyone.

The Future of kdb is with you

It’s an exciting time and the demand for storing and analysing large time-series is growing. We believe by becoming cloud first and community driven we can continue to provide solutions for many years to come.

kdb – Feature Wishlist

Features I want:

  1. Open Sourced kdb (a person can dream). As one of the top 5 tools in my programmer’s toolbox, it’s frustrating that kdb is closed source. I can’t use the tool everywhere, and the price can be increased at any time.
  2. Increased Ease of Use
    • Block user queries that will obviously kill the database (select from quote).
    • Do not quit out when a query takes too much memory (-w exceeded, or all RAM/swap on the box gone). Sensibly return an error and keep going.
  3. Faster Speed – Admittedly this isn’t a strong requirement for any work I do but it irritates me as a programmer to know some easy 10x speed improvements are not being used.
    • Perform warmup queries and counts on startup automatically to get most recent data into memory.
    • Replace the kdb/q code with CPU vector functions
    • Parse the user query and optimize it. If a user sends “select from trade where a=1,b=2,c=3,d=3”, automatically order the evaluation of the where clause to at least prioritize those columns with attributes.
  4. Marketing – I didn’t think this would be on my wishlist… but if you can market kdb better, I would love to stop having people suggest I use MongoDB/Hadoop/latestFad when kdb is a great fit for the problem at hand.
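
The where-clause reordering wished for in item 3 can be sketched in a few lines. This is a minimal illustration in Python, not how kdb+ works internally: the function names, the tuple representation of constraints and the attribute map are all assumptions made for the example. It only handles simple equality filters, where reordering does not change the result.

```python
# Hypothetical sketch: the names below are illustrative, not part of kdb+.

def reorder_where(constraints, attributes):
    """Move constraints on attributed columns to the front.

    constraints: list of (column, value) equality filters, in the user's order.
    attributes:  dict mapping column name to its attribute ("s", "u", "p", "g");
                 columns without an attribute are simply absent.
    sorted() is stable, so ties keep the user's original order.
    """
    return sorted(constraints, key=lambda c: 0 if attributes.get(c[0]) else 1)


def render(table, constraints):
    """Rebuild the qSQL text from the (possibly reordered) constraints."""
    where = ",".join(f"{col}={val}" for col, val in constraints)
    return f"select from {table} where {where}"


user_constraints = [("a", 1), ("b", 2), ("c", 3), ("d", 3)]
assumed_attributes = {"c": "g"}  # assumption: only column c carries an attribute
optimized = reorder_where(user_constraints, assumed_attributes)
print(render("trade", optimized))
```

With only column c attributed, the rewritten query filters on c first and leaves the remaining constraints in the order the user wrote them.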

jq supports functions and new keywords.

Jq has now added support for:

  1. Functions – {x+1}. Unnamed parameters beyond x don’t yet work so please name all your parameters.
  2. Keyed table operations: xkey, 1!, 2!, keys, value.
  3. New keywords supported: in, distinct, inter, except, rank, sv, vs, sum, prd, xlog.
  4. Improved compatibility and support of: null, avg, var, iasc, upper, lower, fills, fill, ^, sublist, prds, sums.

The added keywords in most cases will only support the most common types and arguments.
Mixed lists in particular are not handled well by most keywords but we will continue to improve.

JQ functions

Exact Contents of Our Online KDB+ Training Course

We often get asked what is in our online training course.
We do describe this on the course page and in a PDF, but to be totally thorough, here’s a screenshot of our full listing:

kx kdb – 2019 in Review – Changes

Shakti

The biggest shakeup in the KDB world was Arthur Whitney, the founder of KX and creator of KDB, selling his stake in KX and moving on to create a new version of the K language called Shakti. “Shakti merges database, language, connectivity and stream processing into one powerful platform”. So far it appears to overlap heavily with kdb functionality, adding further cryptographic features, while not yet supporting on-disk storage.

KDB Version 3.7 Changes:

  • App Direct Mode – give users control over Intel Optane DC Persistent Memory.
  • Multi-Threaded Primitive Operations
  • Data at Rest Encryption.

KDB Version 3.6 Changes:

  • Websocket – Improvements and bugfixes
  • Speed Improvements
    • When attributes present use them more often.
  • Improved Error Reporting
    • Broken or closed handles report their number
    • Fatal memory errors log a timestamp

FD/KX Products:

 

kx kdb – 2018 in Review – Changes

kdb Version 3.6 Changes:

  • Enums and linked columns now use 64 bit indices
    • This is a disk-format change, i.e. newly saved data will NOT be backwards compatible.
    • 3.6 will be able to read data in the old format
  • AnyMap – Mapped Nested Types
    • Ability to save unmappable compound objects with >2 billion elements
    • Mapped list elements can be of any type, and the data remains mapped, NOT copied to the heap.
      • Symbols are automatically enumerated against a file with three ###s in the name.
  • Deferred Response – -30!x Allows a deferred response to a sync query. In practice it is difficult to use correctly.
  • New Functions:
    • .Q.hg – HTTP get allows retrieving web page as a list of strings.
    • .Q.dpts/.Q.dpfts added to allow specifying the enum domain
    • .Q.sha1 – SHA-1 encode text
    • .Q.ts – Allows timing a function call similar to apply “.”.
    • xcol – Now supports dictionary to remap column names
    • -27! to allow formatting similar to .Q.f
    • .j.jd – Allows specifying dictionary of options when calling json serialization.
    • .Q.btoa – Base 64 encode
    • .Q.hp – HTTP Post – .Q.hp[url;mimeType;query]
  • Performance Improvements on
    • grouping
    • filtering
    • particularly when attributes present
  • SSL – Improvements and bugfixes
  • WebSockets – Improvements and bugfixes

KX/FD Shares Fall 30%

First Derivatives shares have fallen back to a price last seen in February 2017:

One cause of the fall has been a damning article by ShadowFall. Their main arguments are:

  • First Derivatives was being priced highly as a software company
  • It is not a software company but a consultancy.
  • Previously good years were due to outside factors (property prices and government grants)
  • They have made a significant investment in KX which may itself have stopped growing

The 47-page report goes into a lot of detail. To give an idea, here’s one of the charts:

The report shows numerous statistics for FD compared to its peers: operating margin, gross margin, revenue and headcount. It’s worth a read if you have an interest in kdb/KX/FD.

Related Links: Shadowfall tweet, Independent.ie.

qStudio 1.45 Released

qStudio 1.45 has been released. In this version we have:

  • Bugfix: Ctrl+F Search in source fixed. (Thanks Alex)
  • Added Step-Plot Chart display option
  • Added Stacked Bar Chart display option
  • Added Dot graph render display option (Inspired by Noormo)
  • Bugfix: Hidden folders/files regex now works again in file tree and command bar. Target and hidden folders are ignored by default.
  • Bugfix: Mac was displaying a startup error with Java 9

Download

Some example charts:


qStudio adds Step Plots for displaying price Steps.

Our standard time-series graph interpolates between points. When the data you are displaying is price points, it’s not really valid to always interpolate. If the price was 0.40 at 2pm then 0.46 at 3pm, that does not mean it can be read as 0.43 at 2:30pm. Amazingly, until now sqlDashboards had no sensible way to show that data. Now we do:

For comparison here is the same data as a time-series graph:

The step-plot is usable for time-series and numerical XY data series. The format is detailed on the usual chart format pages.
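
The difference between the two chart types can be shown numerically. The sketch below is a small Python illustration, not qStudio code: the function names are made up for this example and times are assumed to be minutes since midnight. It compares what a step plot and an interpolated line would each imply for the 2:30pm price in the example above.

```python
from bisect import bisect_right

# Hypothetical helpers for illustration only; not part of qStudio or sqlDashboards.

def step_value(times, prices, t):
    """Last observed price at or before t, which is what a step plot draws."""
    i = bisect_right(times, t) - 1
    if i < 0:
        raise ValueError("t is before the first observation")
    return prices[i]


def linear_value(times, prices, t):
    """Straight-line interpolation between the observations either side of t."""
    i = bisect_right(times, t) - 1
    if i < 0 or t > times[-1]:
        raise ValueError("t is outside the observed range")
    if i == len(times) - 1:  # t is exactly the last observation
        return prices[i]
    t0, t1 = times[i], times[i + 1]
    p0, p1 = prices[i], prices[i + 1]
    return p0 + (p1 - p0) * (t - t0) / (t1 - t0)


times = [14 * 60, 15 * 60]   # 2pm and 3pm, as minutes since midnight
prices = [0.40, 0.46]
half_past_two = 14 * 60 + 30

print(step_value(times, prices, half_past_two))    # the price actually in force
print(linear_value(times, prices, half_past_two))  # the value a line chart implies
```

The step lookup returns the 0.40 that was actually in force, while the interpolated value lands at roughly 0.43, a price that was never observed.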

qStudio now supports Stacked Bar Charts

qStudio has added support for stacked bar charts:

The chart format for this is: The first string columns are used as category labels. Whatever numeric columns appear next are a separate series in the chart. Each row in the data becomes one stacked bar. The table for the data shown above for example is:

dt          LSE   BTS   NAS  ASE  NYQ   SES  TSE  HKG
2018-03-30  1047  2120  592  25   3660  303  225  383
2018-03-29  1148  2118  528  10   3656  541  215  303
2018-03-28  1201  2085  555  17   3644  302  290  339
2018-03-27  1206  2182  535  21   3604  235  299  319
2018-03-26  1239  2041  515  16   3549  251  234  363
2018-03-25  0     0     0    0    0     0    0    0
2018-03-24  0     0     0    0    0     0    0    0
2018-03-23  1379  2115  595  29   3430  138  251  348
2018-03-22  1431  2179  517  25   3399  531  222  320
2018-03-21  1530  2032  558  29   3282  438  296  359
2018-03-20  1531  2134  520  23   3256  515  265  322

You may need to “kdb pivot” your original data to get it in the correct shape.
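
To make the required shape concrete, here is a minimal Python sketch of the pivot step. It assumes the original data arrives in "long" form with one row per date and exchange; the column names `exchange` and `value` and the `pivot` helper are hypothetical, and filling missing cells with 0 is a choice made for this example. In practice the pivot would be done in q before the result reaches the chart.

```python
# Hypothetical long-format input: one row per (date, exchange).
# The numbers are taken from the first two rows of the table above.
long_rows = [
    {"dt": "2018-03-30", "exchange": "LSE", "value": 1047},
    {"dt": "2018-03-30", "exchange": "BTS", "value": 2120},
    {"dt": "2018-03-29", "exchange": "LSE", "value": 1148},
    {"dt": "2018-03-29", "exchange": "BTS", "value": 2118},
]


def pivot(rows, row_key, col_key, val_key):
    """Turn long rows into a header plus one wide row per distinct row_key.

    Column order follows first appearance; missing cells are filled with 0
    (an assumption chosen so every bar has a value for every series).
    """
    columns = []
    wide = {}
    for r in rows:
        if r[col_key] not in columns:
            columns.append(r[col_key])
        wide.setdefault(r[row_key], {})[r[col_key]] = r[val_key]
    header = [row_key] + columns
    table = [[key] + [cells.get(c, 0) for c in columns] for key, cells in wide.items()]
    return header, table


header, table = pivot(long_rows, "dt", "exchange", "value")
print(header)
for row in table:
    print(row)
```

Each output row corresponds to one stacked bar, and each column after the first corresponds to one series in the chart.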