The Data Pyramid is a Lie

If you work with data, at some point you will be presented with a PowerPoint slide similar to this:

Data Pyramid Lie

A wonderful fictional land, where each layer is built cleanly on the one below until we reach the heavens (in the past this was wisdom or visualization; increasingly it’s mythical AI).

There are two essential things missing from this:

  1. At the end of every data sequence there should be an Action.
    If there isn’t, what are we even attempting to do?
    Wisdom should lead to Action. A visualization or email alert should prompt Action. There MUST always be Action.
  2. At every stage, there is feedback. It’s a cycle, not a mythical pyramid or promised land.
    I’ve never met anyone working with data who didn’t find something out at a later stage that meant going back to rework their previous steps.
    e.g.

    1. Looking at the average height of males, the United States shows 5.5m. Oops, I guess I had better go back and interpret that as feet instead of metres.
    2. Based on analysis, you tried emailing a subset of customers that should have converted to paying customers at a 5% rate, but they didn’t. So based on action, you discovered you were wrong. Time to go back to the start and examine why.

Therefore the diagram should look more like this:

Data Cycle

You start with data and you reach Action, but at any stage, including after Action, you can loop back to earlier stages in the cycle.

I’ve purposely blurred out the steps because it doesn’t matter what’s in between. In between should be whatever gets your team to the action quickest with an acceptable level of risk. Notice this is the software development lifecycle (SDLC). Software people spent years learning this lesson and it’s still an ongoing effort to make it a proper science.

What do you think? Am I wrong?

Pulse – adds metrics and QuestDB tutorial

We’ve added a new tutorial and demo, creating a crypto dashboard with a QuestDB backend:
questdb database crypto dashboard

Pulse – Real-time interactive Dashboards 0.14.1 adds a Metrics Panel.
It allows tracking headline text while still showing the trend as a background chart.

qStudio 2.0

qStudio recently celebrated its tenth birthday and it continues to be the main IDE for many kdb+ developers. We want to keep making it better. Version 2.0 now includes

If you’ve been using qStudio, we would love to hear your feedback. Please get in touch.

DuckDB SQL

Download qStudio

qStudio 1.42 Release – Numerous bugfixes and improvements

Download the latest qStudio now.

qStudio Improvements

  • Bugfix: Sending an empty query would cause qStudio to get into a bad state.
  • Default to chart NoRedraw when first loaded to save memory/time.
  • Preferences Improvements
    • Option to allow saving Document with Windows \r\n or Linux \n line endings. Settings -> Preferences… -> Misc
    • Allow specifying regex for folders that should be ignored by the “File Tree” window and Autocomplete
  • Add copy “hopen `:currentServer” command button to toolbar.
  • Ctrl+p Shortcut – Allow opening folders in explorer as well as files.
  • Smarter Background Documents Saving (30 seconds between saves on background thread)

sqlDashboards Improvements

  • Allow saving .das without username/password to allow sharing. Prompt user on file open if it can’t connect to the server.
  • Bugfix: Allow resizing of windows within sqlDashboards even when “No table returned” or query contains error.
  • If a query is invalid, for example because it is missing an argument, report the reason.
  • Stop wrapping JDBC queries, as we don’t want kdb to use the standard SQL handler. We want to use the q) handler.

Open Source Alternative to kdb?

I often get asked what open source alternatives there are to kdb+. The answer depends on what you are trying to do. If there were a product XYZ that provided some similar features, whether it could replace kdb depends on a few issues:

>>”What will XYZ bring us that kdb doesn’t?”
Kdb has been tried and tested over many computer/man-years. The KX team have fixed thousands of edge cases, optimization issues and OS-specific bugs. Any similar system would have to replicate a lot of that work. Possible, but it would take time and teams actually using it. It would also require a corporate entity to provide support and bug fixes together with long-term guarantees of availability (not a few part-time committers on GitHub). On top of that it would need to deliver more value to make it worth switching.

Kdb is both a database and a programming language and it’s that combination which I believe gives kdb its unique power:
– There is no open source database that provides the speed kdb provides for the particular queries suited to finance.
– Basing queries on q-sql and ordered lists (rather than the set theory behind standard SQL) means queries require fewer lines of code. I believe this expressiveness, combined with longer-term use of kdb/q, changes how you think and makes it easy to form queries which many people couldn’t begin to write in standard SQL.
– However as much as I think q is a selling point of kdb, I know many others would disagree. It takes a reasonable period of time to convince someone non-standard SQL is beneficial.

What is your use case? Example queries to consider:

1. Select top N by category
http://stackoverflow.com/questions/176964/select-top-10-records-for-each-category
select n#price by sym from trade

2. Joining Records on nearest date time:
http://www.bigresource.com/MS_SQL-joining-records-by-nearest-datetime-XsKMeH3t.html
aj[`sym`time;select .. from trade where ..;select .. from quote]

3. Queries dependent on order (e.g. price change, subtracting a row from the previous one):
http://stackoverflow.com/questions/919136/subtracting-one-row-of-data-from-another-in-sql
select price-prev price from trade....
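
For comparison, here is a rough sketch of what those three queries involve in plain Python. The sample trade and quote rows, the column names and the helper function names are all made up for illustration, and it assumes the rows are already sorted by time. It says nothing about how kdb implements these internally; it is only meant to show how much code the q one-liners replace.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical sample data; rows are assumed to be sorted by time.
trade = [
    {"sym": "A", "time": 1, "price": 10.0},
    {"sym": "B", "time": 2, "price": 20.0},
    {"sym": "A", "time": 3, "price": 11.5},
    {"sym": "A", "time": 5, "price": 11.0},
    {"sym": "B", "time": 6, "price": 19.0},
]
quote = [
    {"sym": "A", "time": 0, "bid": 9.9},
    {"sym": "B", "time": 1, "bid": 19.8},
    {"sym": "A", "time": 4, "bid": 10.9},
]


def first_n_by_sym(rows, n):
    """Roughly `select n#price by sym from trade`: first n prices per sym,
    in row order (ignoring q's wrap-around when a group has fewer than n rows)."""
    out = defaultdict(list)
    for r in rows:
        if len(out[r["sym"]]) < n:
            out[r["sym"]].append(r["price"])
    return dict(out)


def asof_join(trades, quotes):
    """Roughly aj on sym and time: attach the latest quote at or before each trade."""
    times = defaultdict(list)   # sym -> quote times, ascending
    by_sym = defaultdict(list)  # sym -> quote rows, same order as times
    for q in quotes:
        times[q["sym"]].append(q["time"])
        by_sym[q["sym"]].append(q)
    joined = []
    for t in trades:
        # Index of the first quote strictly after the trade time.
        i = bisect_right(times[t["sym"]], t["time"])
        bid = by_sym[t["sym"]][i - 1]["bid"] if i else None
        joined.append({**t, "bid": bid})
    return joined


def price_changes(rows):
    """Roughly `select price-prev price from trade`: the first change is None."""
    prices = [r["price"] for r in rows]
    if not prices:
        return []
    return [None] + [b - a for a, b in zip(prices, prices[1:])]
```

Calling `first_n_by_sym(trade, 2)`, `asof_join(trade, quote)` and `price_changes(trade)` gives the per-sym prices, the trades with their prevailing bid, and the row-to-row differences respectively.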

XYZ would need to support these queries well. Why would I choose XYZ instead of Python/R/J/A+?
These are existing languages (some of them similar) that offer a larger user base, more libraries and a proven, stable platform. Unless a way is found to leverage existing languages/libraries, XYZ will be competing for attention against kdb and also python/numpy/julia etc.

>>”bring in the cost factor and should XYZ be considered as a big future player?”
For the target market of kdb the cost is often not the most significant factor in the decision. If kdb can answer questions that other platforms can’t or in a much shorter time, it often adds enough value to make the cost irrelevant. In fact many large firms are happy paying a pricey support agreement for free open source software so that they have someone to (blame) call to resolve an issue quickly.

>>”but could XYZ catch up and begin to be trusted by bigger institutions?”
Perhaps, if XYZ became able to answer the three example queries shown above within a reasonable multiple of kdb’s speed, but I consider it unlikely. Kdb is entrenched and for its target use case it is currently unbeatable. Some people may have use cases that don’t need the full power of database and language combined, or have other important factors (cost, existing expertise). I think those use cases have viable open source solutions.

Julia Programming Language

The Julia programming language is being touted as the next big thing in scientific programming. It’s high-level like R/Python but meant to be much faster due to its smart compiler. I’ve been giving it a bit of a tryout. As part of learning it, I’ve generated a list of all Julia functions and will be creating examples for some of the more popular ones.

kdb+ Twitter Data Feed Now Open Sourced

Free kdb+ Twitter Feedhandler

twitter-kdb-logo

Previously we showed a demo of getting data from Twitter into kdb. We are now open sourcing part of that work, allowing you to quickly get some real social data into kdb to play with.

If you want to try running the kdb Twitter data feed, visit our GitHub page at https://github.com/timeseries and see the twitter-kdb project. You can even download the jar straight from our releases page. Here’s an image of the command line version running:

twitter-kdb

You will need to set up API keys for Twitter.

First Derivatives buy a majority stake in KX

First Derivatives buys £36m majority stake in Kx

Newry-based financial software firm First Derivatives has acquired a majority shareholding in big data analytics company Kx Systems for £36m (€44m).

KX has historically had a hard time penetrating markets outside finance. FD have a good sales team and previously acquired a marketing company in Philadelphia, so hopefully this is the chance for kdb+ to go mainstream.

However, it’s a worry that FD (First Derivatives) may increasingly “encourage” purchasing of the Delta platform bundle rather than stand-alone kdb+. With the smaller margins outside of finance, will FD take a risk and open up the database? (Would FD have been a supporter of the 32-bit version becoming free for commercial use?) There’s a large number of individuals in off-shore locations who want to learn kdb+; FD could be incentivized to discourage that as it would hurt their consulting business.

It’s also interesting to consider companies that have already invested in KX technology and whether they will continue to do so:

  • Competing consulting firms that specialise in kdb+ won’t take this as good news.
  • Panopticon/Datawatch based their visualization system on kdb+ (OEM license). They probably regret that now, given that their visualization software directly competes with FD’s dashboards.
  • Companies that had used kdb+ as part of their trading platform stack may consider FD a competitor as they also offer a trading platform.

What do you think? Will this lead to wider adoption? A growing platform?

sqlDashboards 1.31 Released – Interactive Forms

sqlDashboards is a tool for creating real-time SQL-based charts.
The latest 1.31 release is available to download.

Our new “Forms” now make the charts interactive. Here’s an example dashboard showing some stock data. Notice the form in the top left, which allows selecting a stock ticker and the number of days of data to show. On the bottom right is another form containing checkboxes for each country. We can change these selections and the charts will be updated instantly.

sql-dashboards-sql-chart-form

Now I modify my selection to ask for more days of data and untick some country checkboxes to alter which ones are shown in the pie chart. The charts update straight away to give this:

sql-dashboards-sql-chart-form-2

Full details on how to create forms can be found in the sqlDashboards help.
If you use kdb, see the “Open example kdb dashboard” option in our help menu.

Multiple worksheets and Full Screen

Customers had asked us to allow multiple worksheets, and this has now been added, along with a new Full Screen mode to maximise display use when not editing the dashboard. The options can be found here:

sql dashboards full screen worksheets

We are going to continue to add new functionality: increasingly configurable charts, command line chart generation, web interfaces. If there are any features you would like added, please get in touch; we are always happy to receive feedback.

kdb+ London User Group Meeting 2013

The annual kdb+ London User Group Conference has been scheduled for Tuesday November 12th 2013. The speaker line-up includes:

  • Simon Garland – Kx – Things you might have missed in 3.1
  • NYSE Technologies – Tick as a Service & Data Dispatch
  • David Fallon – Credit Agricole – Using kdb+ for FX Trade Analysis
  • Andy Wisbey – First Derivatives plc – Using kdb+ for Real-Time Surveillance

FD kdb+ Consultants

  • Efficient portfolio analysis using linked columns in kdb+
  • Permissions in kdb+
  • Multi-threading in kdb+: performance optimisations and use cases