Archive for September, 2023

The Data Pyramid is a Lie

If you work with data, at some point you will be presented with a PowerPoint slide similar to this:

Data Pyramid Lie

It depicts a wonderful fictional land where we cleanly build each layer on the one below until we reach the heavens. (In the past the top was wisdom or visualization; increasingly it’s mythical AI.)

There are two essential things missing from this:

  1. At the end of every data sequence there should be an Action.
    If there isn’t, what are we even attempting to do?
    Wisdom should lead to action. A visualization or email alert should prompt action. There MUST always be action.
  2. At every stage, there is feedback. It’s a cycle, not a mythical pyramid or promised land.
    I’ve never met anyone working with data who didn’t find something out at a later stage that meant going back and reworking their previous steps.
    e.g.

    1. Looking at the average height of males, the United States shows 5.5m. Oops, I guess I’d better go back and interpret that as feet instead of metres.
    2. Based on analysis, you emailed a subset of customers who should have converted to paying customers at a 5% rate, but they didn’t. So, based on action, you discovered you were wrong. Time to go back to the start and examine why.
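
To make the first example concrete, here is a minimal sketch of the kind of check that sends you back a step. Everything in it is my own assumption for illustration: the function name `normalise_height`, the plausible range of 1.0 to 2.5 metres, and the rule that an implausible value gets re-read as feet.

```python
FEET_TO_METRES = 0.3048

# Assumed plausible range for an average adult height, in metres.
# These bounds are illustrative, not taken from any dataset.
PLAUSIBLE_METRES = (1.0, 2.5)


def normalise_height(value):
    """Return a height in metres, re-reading the value as feet if needed.

    The value is first treated as metres. If that is implausible but the
    same number read as feet is plausible, the converted value is returned.
    Otherwise the problem is surfaced instead of being silently guessed.
    """
    low, high = PLAUSIBLE_METRES
    if low <= value <= high:
        return value

    as_feet = value * FEET_TO_METRES
    if low <= as_feet <= high:
        # Feedback step: a later stage told us the earlier unit was wrong.
        return as_feet

    raise ValueError(f"{value} is not a plausible height in metres or feet")


if __name__ == "__main__":
    print(round(normalise_height(5.5), 2))   # 5.5 read as feet
    print(normalise_height(1.75))            # already plausible as metres
```

In practice you would probably fix the unit at the source rather than guess per value, but the point stands: the check only exists because a later stage exposed a mistake in an earlier one.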

Therefore the diagram should look more like this:

Data Cycle

You start with data and you reach Action, but at any stage, including after action, you can loop back to earlier stages in the cycle.

I’ve purposely blurred out the steps because it doesn’t matter what’s in between. In between should be whatever gets your team to the action quickest with an acceptable level of risk. Notice this is the software development lifecycle (SDLC). Software people spent years learning this lesson, and it’s still an ongoing effort to make it a proper science.

What do you think? Am I wrong?

 

 

New Streaming Order Book Depth-map

We want to be the best finance streaming visualization solution. To achieve that, we can’t just use off-the-shelf parts, so we have built our own market data order book visualization component from scratch. Its only dependency is WebGL. We call it DepthMap. It plots price levels over time, with the shading showing the amount of liquidity at each level. It’s experimental right now, but we are already receiving a lot of great feedback and ideas.
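
If you are wondering what "price levels over time, shaded by liquidity" means as data, here is a rough sketch. It is not the DepthMap implementation, and the WebGL rendering is not shown. The snapshot format (a timestamp plus a price-to-quantity mapping) and the function names `build_depth_grid` and `shade` are hypothetical, chosen only for illustration.

```python
def build_depth_grid(snapshots):
    """Turn order book snapshots into a price-by-time grid of quantities.

    snapshots: list of (timestamp, {price: quantity}) pairs, oldest first.
    Returns (times, prices, grid) where grid[i][j] is the quantity resting
    at prices[i] when snapshot j was taken (0.0 if that level was empty).
    """
    times = [t for t, _ in snapshots]
    # Highest price first, so the grid reads like a chart from top to bottom.
    prices = sorted({p for _, book in snapshots for p in book}, reverse=True)
    grid = [[book.get(p, 0.0) for _, book in snapshots] for p in prices]
    return times, prices, grid


def shade(grid):
    """Scale quantities to the range 0..1 so they can be used as shading."""
    peak = max((q for row in grid for q in row), default=0.0)
    if peak == 0:
        return [[0.0 for _ in row] for row in grid]
    return [[q / peak for q in row] for row in grid]


if __name__ == "__main__":
    snapshots = [
        (0, {100.0: 2.0, 101.0: 1.0}),
        (1, {101.0: 4.0}),
    ]
    times, prices, grid = build_depth_grid(snapshots)
    for price, row in zip(prices, shade(grid)):
        print(price, row)
```

Each row is one price level, each column is one point in time, and the scaled value is how dark that cell would be drawn.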

Faster Streaming Data
A lot of our users were capturing crypto data to a database, then polling that database. We want to remove that step so Pulse is faster and simpler. The first step is releasing our Binance Streaming Connection. In addition to our existing kdb streaming connection, we are trialling WebSockets and Kafka. If this is something that interests you, please get in touch.