Pulse 0.13.5 Adds Sparklines and Dynamic HTML

Pulse – Real-time interactive Dashboards 0.13.5 adds Sparkline and Dynamic HTML support

  • Dynamic HTML – Full user control to generate HTML using template languages
  • Sparklines – Embed small charts within a table by specifying nested arrays.

DOWNLOAD NOW

Pulse is designed to provide real-time interactive dashboards, so the underlying database has to be really fast. Pulse can support almost any data source; the question is which databases are worth supporting.

Time-series databases are the fastest growing database sector (image below).
The great news is that in the last few years there have been a lot of interesting new entrants. So we’ve updated our past articles:

  1. Top Column-Oriented Databases – DuckDB, ClickHouse and Doris are the exciting new entrants. Benchmark results in article.
  2. Top Time-Series Databases – Have exploded in popularity. QuestDB and Timescale are the new entrants there. Benchmark results in article.

Building kdb+ Trader Dashboards
Kdb+ Streaming
Pulse – kdb+ Streaming Subscriptions

kdb+ Learning Curve

I present the kdb+ learning curve:

kdb learning curve

Admittedly it has got a little better in the last ten years:
– Google will return some results that may contain useful solutions
– The documentation online has grown massively, e.g. Q for Mortals and the TimeStored material
– There have been multiple books written

But very little has changed to make the language more friendly.
– There is a debugger but it’s not very user friendly.
– The error messages are still cryptic

Some parts probably can’t be helped… right-to-left evaluation is always going to surprise people, but it would be nice to see some attempts.

Pulse – as a qStudio alternative

qStudio is the number one code editor, server browser and development environment for kdb+.
Today we are launching Pulse, a real-time SQL visualization tool for almost any SQL database.
Within Pulse we have recreated almost all the functionality of qStudio in web form.

If you’ve ever wanted to:

  • Share queries and results
  • Run analysis from any machine with a browser without a need to install qStudio
  • Chart results using a modern charting library

You should consider using Pulse as a shared HTML5-based complement to qStudio.

As you can see below, Pulse can be made to look almost exactly the same as qStudio. It features the familiar configurable layout with a table/console/charting result panel that allows you to see your result in all formats at once.

Additionally you can:

  • Bookmark, copy and share links
  • Use it with kdb+, PostgreSQL, MySQL, Oracle and H2 databases
  • Try many more chart types including 3D.

Try Pulse Now

kdb 2019 Roadmap – Cloud and Community

This article was drafted in 2019. Given that it is now 2021, it makes for an interesting look back and, sadly, still a look forward…

There have been two big changes in the software world:

  • The Cloud
  • Community collaboration

Winners and Losers

From these shifts, there have been winners and losers:

  • Community helped Wikipedia build the best encyclopaedia,
    relegating Britannica and Encarta to history.
  • Community helped Linux become the dominant operating system;
    Solaris and OS/2 systems are now only used in legacy niches.
  • Community-developed Python is replacing MATLAB.
  • Cloud has seen Atlassian/GitHub/Amazon/Salesforce etc. win by offering SaaS solutions
    to replace what would previously have been locally installed software (SAP/Perforce).
  • Cloud-hosted Gmail/Hotmail has replaced companies running their own mail servers.
If kdb doesn’t change it will become a legacy platform with developers maintaining legacy systems that over time will be replaced with modern cloud alternatives.

Therefore we are starting two initiatives:

Cloud native KDB

  • A fully-managed time-series database hosted on Google Cloud
  • Able to be signed up for and used within 10 minutes
  • Clear predictable pricing based on storage and query usage
  • Hiding all the complexity of kdb (no par.txt/segments/sym file manipulation)
  • While providing access to the speed and expressiveness of the language
  • Taking advantage of modern load balancing (Kubernetes) and cheap storage (S3)

We have a skunkworks team based in their own office, tasked with making a kdb database cloud solution so reliable and feature-rich that your kdb experts can stop working on keeping the database running and instead focus on business problems.

Community Driven q

We want kdb to run everywhere, for the barriers to adoption to drop and for the language to expand what it can do. A new kdb user will be able to run kdb on their machine through their standard package manager and to access a whole library of utilities to help them with whatever task they are trying to achieve.

KDB everywhere

To do this, we’ve formed a committee including representatives from finance/education and the wider community to:

  • Open source the q language
    • Development possibilities will be opened up to the wider community as anyone can submit ideas or even PRs for experimental functionality
    • Being open source allows kdb to be bundled with Linux and we see this as allowing wider use of q scripts
  • Create a hosted packaging system that allows reusing code easily, similar to npm/Maven
    • Providing a wider library of community maintained packages that are easily reusable
    • Work with AquaQ to migrate parts of their TorQ framework to provide a kdb standard library
    • Work with the community to onboard some of their code as packages
      e.g. TimeStored is donating qunit
  • Provide a recommended SDLC for kdb. Over the years we’ve developed processes for end to end development of q code at scale and we will be providing that same tooling to everyone.

By both open sourcing the language and allowing easier development of shareable packages, we accelerate the pace at which kdb can help all developers solve problems and share solutions, making the kdb platform stronger for everyone.

The Future of kdb is with you

It’s an exciting time and the demand for storing and analysing large time-series is growing. We believe by becoming cloud first and community driven we can continue to provide solutions for many years to come.

kdb – Feature Wishlist

Features I want:

  1. Open Sourced kdb (a person can dream). As one of the top 5 tools in my programmer’s toolbox it’s frustrating that kdb is closed source. I can’t use the tool everywhere and at any time the price can be increased.
  2. Increased ease of use
    • Block user queries that will obviously kill the database (select from quote).
    • Do not quit out when a query takes too much memory (-w exceeded or all RAM/swap on box gone.). Sensibly return an error and keep going.
  3. Faster Speed – Admittedly this isn’t a strong requirement for any work I do but it irritates me as a programmer to know some easy 10x speed improvements are not being used.
    • Perform warmup queries and counts on startup automatically to get most recent data into memory.
    • Replace the kdb/q code with CPU vector functions
    • Parse the user query and optimize it. If a user sends “select from trade where a=1,b=2,c=3,d=3”, automatically order the evaluation of the where clause to at least prioritize the columns that have attributes.
  4. Marketing – I didn’t think this would be on my wishlist…but if you can market kdb better I would love to stop having people suggest I use mongodb/hadoop/latestFad when kdb is a great fit for the problem at hand.
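
To illustrate the where-clause ordering wish in item 3: q applies where constraints from left to right, each one narrowing the rows seen by the next, so today the reordering has to be done by hand. The following is a minimal sketch only. The `trade` table, its columns and its values are hypothetical and chosen purely for illustration.

```q
/ Hypothetical trade table; `g# applies the grouped attribute to sym
trade:([] sym:`g#`a`b`a`c; price:1 2 3 4f; size:10 20 30 40)

/ Unattributed constraint first: size is compared on every row
select from trade where size>15, sym=`a

/ Attributed constraint first: only the sym=`a rows reach the size check
select from trade where sym=`a, size>15
```

Both queries return the same rows; only the amount of work differs. An optimizer of the kind wished for above would rewrite the first form into the second automatically.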

qStudio Dot Graph Rendering of FIX Order Status

“The Financial Information eXchange (FIX) protocol is an electronic communications protocol initiated in 1992 for international real-time exchange of information related to the securities transactions and markets.” You can see an example of a FIX message being parsed here.

What we care about is that an order goes through a lifecycle, from newly created to filled or removed. Anything that involves state transitions or a lifecycle can be visualized as a graph, which depicts transitions from one state to another. Often SQL tables record every transition of that state. This can then be summarised into a count per transition, giving something like the following:

From            To              label           cnt
PendingCancel   Calculated      Rejected        50
PendingReplace  Calculated      Rejected        10
PendingReplace  Calculated      Replaced        40
Calculated      PendingReplace  PendingReplace  50
Calculated      Filled          Trade           9400
Calculated      Calculated      Trade           5239
PendingCancel   Removed         Cancelled       150
Calculated      PendingCancel   PendingCancel   200
New             Calculated      Calculated      9660
New             Removed         Rejected        140
Created         Removed         Rejected        300
Created         New             New             9800
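
A table of this shape can be produced from a raw transition log with a grouped count. The sketch below assumes a hypothetical log table `orderEvents` with one row per state change. The table name, the `orderId` column and the sample rows are invented for illustration, and `label` is held as a symbol here for simplicity.

```q
/ Hypothetical log: one row per state change of an order
orderEvents:([] orderId:1 1 2 2 3; From:`Created`New`Created`New`Created; To:`New`Calculated`New`Calculated`Removed; label:`New`Calculated`New`Calculated`Rejected)

/ Count how often each transition occurred; 0! removes the key for display
transitionCounts:0!select cnt:count i by From,To,label from orderEvents
```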

qStudio now automatically converts this result table to DOT format and, if you have Graphviz installed and on the PATH, will generate the following:

Note I did tweak the table a little to add styling like so:

update style:(`Filled`Removed!("color=green";"color=red")) To,label:(label,'" ",/:cnt) from currentFixStatus

The format is detailed again in our qStudio Chart Data Format page.
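
For readers who have not seen DOT before, the sketch below shows roughly what such a conversion involves. It is not qStudio's own code: `toDot` is a hypothetical helper, the `edges` table is made-up sample data, and it assumes `From`, `To` and `label` are symbol columns.

```q
/ Hypothetical helper, not qStudio's implementation:
/ build the lines of a DOT digraph from an edge table
toDot:{[t]
  edge:{"  ",string[x]," -> ",string[y]," [label=\"",string[z],"\"];"};
  (enlist "digraph G {"),(edge'[t`From;t`To;t`label]),enlist enlist "}" }

/ Made-up sample edges, printed one DOT line per row
edges:([] From:`Created`New; To:`New`Calculated; label:`New`Calculated)
-1 toDot edges;
```

The printed text can be saved to a file and rendered with the Graphviz `dot` command.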

This is another even simpler example:

2018 – The Future of Tech in Banks, particularly Market Data (Part 1)

The structure of banks and finance firms is constantly changing as they evolve towards the structure best suited to today’s environment. The trend over recent years has been for fewer traders and more engineers, as expanded on in this article (thanks Zak). In these posts I’ll describe the current state and where I think fintech, in particular market data capture and kdb, are going.
big-fish eats little-fish

(Newer firms that are) “Tech savvy, led by quants and data engineers rather than the expensive traders sitting on the scrap heap of most banks’ inferior tech, the new entrants now just need people with the skills to win over large numbers of customers.”

Banks as a Stack

Think of banks as a stack of services sitting on top of each other [1]:

Communication within the system is mostly between the layers. Top layers rely on all the services of the layers beneath. e.g. A trader relies on a trading application, that relies on an internal web framework, that relies on a database, that relies on hardware. If we get more traders that need additional software changes, that could transmit down the stack into a request for more hardware. Communication outside the layer model, e.g. Sales asking for additional SAN storage is exceedingly rare.

Within the stack, I’ve highlighted in bold where market data capture sits. I believe most of the points I’ll make can be applied more widely to other areas within the stack but I’ll stick to examples within the area I know. Sometimes the “market data” team will include responsibility for Feeds, sometimes there will be a core team responsible for the database software they use, sometimes there won’t, but it captures the general structure.

Issues with the current Stack

Note each box on the diagram I refer to as a silo. A silo may be one team, multiple teams or a part of a team but generally it’s a group responsible for one area, looking after its own goals.

  • Communication between silos is slow – Currently communication between silos consists of meetings, phone calls and change tickets. Getting anything done quickly is a nightmare. [2]
  • Duplication of Effort – The simplified model above can often be heavily duplicated, e.g. FX, Equities and Fixed Income may have separate teams responsible for delivering very similar goals: an FX Web GUI team, an Equities Web GUI team. All benefit of scale is lost. [3]
  • Misalignment of Incentives – Each silo has its own goals which often do not align with the overall goals of the layers above or below. e.g. The database team may be experts in Oracle; even if an application team thinks MongoDB is the solution for their problem, the database team are not incentivised to supply/suggest/support that solution. [4]
  • Incorrectly Sized Layers – At any time, certain silos or layers within the stack will have too many or too few resources. The article linked at the top of this post suggests the layers should be a pyramid shape, i.e. very few sales/traders to meet today’s electronic market needs. We should be able to contract/expand silos dynamically as required.

Possible Solutions coming in 2018

There are a number of possible solutions to the issues above available today. Unfortunately I will have to expand on them in a future post. I am very interested in hearing others’ views.
Do you believe the stack and issues highlighted are an accurate representation? What solutions do you see coming up? Either comment below or drop me an email.

I will hopefully post [part 2] shortly. If you want to be notified when that happens, sign up to our mailing list.

Notes:

  1. For some reason this reminds me of the OSI 7 layer model.
  2. Amazon try to escape this communication overhead by making everything an API.
  3. Customers may prefer a bank that supplies all services but divisions within banks are too big to enforce conformity. Both limitations are likely due to Dunbar’s number.
  4. Even within a single team, the modern workplace may create conflicting loyalties.

kdb – 2017 in Review

Notable events this year or possibly the previous year due to incoherent memory issues:

  • KX went open on APIs – Improved and open sourced Python, R, Java and Kafka interfaces.
    • Java Driver – Got some new serialization functionality
    • PyQ – KX acquired the rights
    • The fusion/interface/machine-learning team at kdb promise to keep bringing improvements
  • KX went to the cloud – There is now a cloud offering of kdb that is dynamically costed based on usage. It’s for existing customers only so far. Beta is available for personal use but KX may terminate access at any time. You can’t run it on third party “clouds”, so no AWS I guess. It costs $0.10 per core for <=4 cores and $0.05 per core for >4 cores.
  • Other users outside finance start to use kdb – It’s great to see and this probably flows from First Derivatives (FD) having purchased KX. However a number of them seem like proof of concepts pushed by FD to demonstrate it can be used. Hopefully in 2018 we will see more independently operating users.
    • European Space Agency (ESA) – Al Worden, an actual astronaut, came to the London meetup with some great stories.
    • Partnerships with Red Bull Racing and marketing companies demonstrate possible growth opportunities
  • Technical:
    • Debugger with Stack Trace
    • You can now change the number of threads after startup
    • uj/ij changes – A change in the behaviour of ij/lj joins means we now have ljf/ujf functions to provide historical equivalents. This is an old change but worth mentioning here as more people are only now upgrading from kdb 2.x
    • Analyst – a Jupyter notebook / Tableau for kdb – KX launched an “analyst” product: “a complete real time data transformation, exploration and discovery workflow. Using an intuitive point and click interface, the typical analyst can import, transform, filter, and visualize massive datasets without programming”.

kdb lj ij uj joins and upgrading 2.6 to 3.x

A quick post to highlight something a lot of people are bumping into with upgrades. The joins in 3.x for uj/ij and lj all changed how they treat nulls from the keyed table. In particular nulls now by default overwrite existing values. In the past nulls from the joining table did not overwrite and left the original value in the column. See the difference in the 3/three row shown below:


q)t:([] a:1 2 3; b:`one`two`three; c:1.0 2.0 3.0)
q)u:([a:2 3 4] b:`j``l; c:100 200 300.0)

q)t
a b     c
---------
1 one   1
2 two   2
3 three 3

q)u
a| b c
-| -----
2| j 100
3|   200
4| l 300

q)t lj u / v3.x The null from u overwrites previous value in column b
a b   c
---------
1 one 1
2 j   100
3     200

q)t ljf u / v2.0 or ljf - The original 3 value not overwritten by null
a b     c
-----------
1 one   1
2 j     100
3 three 200

Other than the int/long indexing change this is one of the biggest breaking changes in migrating kdb 2.x to 3.x.
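
The same pattern should apply to the other two joins. As a hedged sketch, reusing the `t` and `u` tables defined above and assuming a kdb+ version recent enough to ship the `ijf` and `ujf` variants alongside `ljf`, the calls would look like this. Output is omitted because it has not been run here. Note that `uj` needs both tables keyed or both unkeyed, hence the `1!t`.

```q
q)t ij u         / v3.x inner join: a null in u overwrites the value from t
q)t ijf u        / ijf: the non-null value from t is expected to be kept
q)(1!t) uj u     / v3.x union join on two keyed tables
q)(1!t) ujf u    / ujf: the non-null value from t is expected to be kept
```

When upgrading, it is worth searching the code base for all three joins rather than only `lj`.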

You may also enjoy our full kdb joins article.

qStudio 1.43 Released – mac save bug fixed

qStudio 1.43 Released. This:

  • Adds stack traces to kdb 3.5+
  • Fixes the mac bug where the filename wasn’t shown when trying to save a file.
  • Fixes a number of multi-threading UI problems

Download it now.