A column

Updated Feb 11, 2024

This is a list of the top commercial, financial and open source column-oriented / tick databases available.

Businesses are realizing that one size fits all isn't working for databases. With the increasing acceptance and widespread adoption of alternative data storage systems such as NoSQL, column-oriented databases now receive more attention, and a number of major vendors have started to provide columnar storage as a value-add to their existing databases.

Open Source Column-Oriented Databases

The very early databases (1993-2007) were based on the work of research groups that later saw commercial spinoffs.
From 2010 onwards, a new wave of open source column databases arrived, typically used by web companies to store and analyse user data.

Product | Vendor (release year) | Description | Score | License
DuckDB | DuckDB Foundation, 2018 | An embeddable, in-process, column-oriented SQL OLAP RDBMS. OLAP version of SQLite. | 8 | MIT License
ClickHouse | Started at Yandex (wp), 2016 | Very fast OLAP database with cloud version available. Started 10 years ago at Yandex to store the Russian equivalent of Google Analytics. Open sourced in 2016. Commercialization began shortly after, with some of the original Russian developers moving to the US to form a company for the cloud offering. | 8 | Apache License 2.0
Doris | Started at Baidu (wp), 2017 | Very fast OLAP database with cloud version available. Started at Baidu (the Chinese Google). Open sourced in 2017. | ? | Apache License 2.0
InfluxDB | (wp), 2013 | Originally built by a startup for monitoring and alerting. Now specializing in time-series analysis and IoT. Provides an SQL-like language. | 7 | MIT License
Druid | Started at Metamarkets (wp), 2011 | A distributed data store written in Java. Druid is designed to quickly ingest massive quantities of event data and provide aggregated queries on top. Historically it was only designed to store data in aggregate, but it has increasingly expanded to support full granularity. | 7 | Apache License 2.0
LucidDB | Was a research project. (wp), 2007 | An open source project that DynamoBI attempted to commercialise but that never really took off. Part Java, part C++; only limited connectivity options are available, but the architecture is clearly documented and looks good. | 2 | Apache License
C-Store | University: Brown/Brandeis/MIT (wp), 2006 | An early open source column-oriented database, optimized for reads, produced as a joint research project. Mike Stonebraker from MIT moved on from C-Store to commercialise Vertica. | 2 |
MonetDB | Research Centre based in the Netherlands (wp), 1993 | An early pioneering column data store whose technology has been imitated by others and directly led to the Actian/Vectorwise commercial product. Extremely fast column-oriented database that can handle large amounts of data; however, its basis as a research project shows through in some frustrating aspects (areas of little research value can have outstanding issues for months). | 0 | Mozilla Public License 1.1

Benchmarks

As the results below show, for certain queries column-oriented databases are hundreds of times faster.
Results reproduced from Mark Litwintschik's excellent article.

Setup | Total Query Time (lower = better) | Note
kdb+/q & 4 Intel Xeon Phi 7210 CPUs | 1.04 |
ClickHouse, 3 x c5d.9xlarge cluster | 4.06 |
ClickHouse on DoubleCloud, s1-c32-m128 | 5.77 |
Redshift, 6-node ds2.8xlarge cluster | 8.03 |
Vertica, Intel Core i5 4670K | 147.30 |
Spark 1.6, 5-node m3.xlarge cluster w/ S3 | 2158.00 | NOT column oriented.
SQLite 3, Parquet & HDFS | 6342.00 | NOT column oriented.
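
To give a rough intuition for why a columnar layout helps with analytical queries, here is a minimal sketch in Python. It is a toy model, not how any of the benchmarked systems are implemented: the table, field names and values are hypothetical, and real engines gain most of their speed from on-disk and in-memory layout, compression and vectorised execution. The sketch only shows the access pattern: an aggregate over one column needs to touch that column alone.

```python
from array import array

# Row-oriented: each record is stored together (hypothetical trade records).
rows = [
    {"symbol": "AAA", "price": 10.0, "size": 100},
    {"symbol": "BBB", "price": 20.0, "size": 200},
    {"symbol": "AAA", "price": 11.0, "size": 300},
]

# Column-oriented: each column is stored contiguously.
columns = {
    "symbol": ["AAA", "BBB", "AAA"],
    "price": array("d", [10.0, 20.0, 11.0]),
    "size": array("q", [100, 200, 300]),
}


def total_size_row_store(rows):
    # Every record is visited even though only one field is needed.
    return sum(record["size"] for record in rows)


def total_size_column_store(columns):
    # Only the "size" column is read; the other columns are never touched.
    return sum(columns["size"])


print(total_size_row_store(rows))
print(total_size_column_store(columns))
```

Both functions return the same total; the difference is how much data has to be scanned to get it, which is what matters once the table no longer fits in cache or memory.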

Column Database Benchmarks

ClickBench results:

System & Machine | Relative time (lower is better) | Note
ClickHouse (c6a.metal, 500gb gp2) | ×1.59 |
SelectDB (c6a.metal, 500gb gp2) | ×1.88 |
ClickHouse (m5d.24xlarge) | ×2.15 |
StarRocks (c6a.metal, 500gb gp2) | ×2.16 |
Redshift (4×ra3.16xlarge) | ×2.20 |
DuckDB (c6a.metal, 500gb gp2) | ×2.74 |
MariaDB ColumnStore (c6a.4xlarge, 500gb gp2)† | ×59.27 |
Druid (c6a.4xlarge, 500gb gp2)† | ×150.50 |
PostgreSQL (c6a.4xlarge, 500gb gp2) | ×883.89 | NOT column oriented

Financial Tick Databases

Product | Vendor (release year) | Description
kdb+ | KX (wp), 1998 | An early column-oriented database that has proven itself fast and capable of holding massive amounts of data, widely used in the finance industry. Provides its own vector-based language, q, and offers a variant of SQL specialised for order/time-series-based queries. It has a unique conciseness and consistency compared to other more monolithic databases, as it was mostly created by one man, Arthur Whitney.
One Tick Database | Onetick, 2005 | Column/row-oriented database targeted at the financial sector and specialised for tick data, created by Leonid Frants, who had built a tick solution while at Goldman Sachs.
eXtremeDB | McObject, 2001 | A fast embedded, mostly in-memory database targeted at financial firms and time series data. Its raw API and ability to be embedded within a process make it fast; however, this means a higher configuration cost and learning curve to get started.
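
As an illustration of what an order/time-series-based query looks like, a common request against tick data is "what was the most recent quote at or before this trade?". The following is a minimal sketch in plain Python, assuming the quotes are already sorted by time. It is not q, nor the API of any product listed above; the data and the `bid_as_of` helper are hypothetical.

```python
from bisect import bisect_right

# Hypothetical quote ticks for one symbol, sorted by time (seconds since open).
quote_times = [1, 4, 9, 15]
quote_bids = [100.0, 100.5, 100.25, 101.0]


def bid_as_of(t):
    """Return the most recent bid at or before time t, or None if there is none."""
    i = bisect_right(quote_times, t)  # number of quotes with time <= t
    return quote_bids[i - 1] if i else None


trade_times = [0, 4, 10, 20]
print([bid_as_of(t) for t in trade_times])
# [None, 100.5, 100.25, 101.0]
```

Because the time column is stored in order, each lookup is a binary search over a single column, which is one reason time-ordered columnar storage suits this kind of query.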

Commercial Column-Oriented Database Vendors

Product | Vendor (release year) | Description | Column-Oriented* | Grid Framework | Compression | Download
SingleStore | SingleStore, 2012 | Mixed database that tries to perform well for both transactional and analytics queries. | Yes | Share-Nothing Scaleout | Yes | Cloud Trial
InfiniDB | Calpont (wp), 2000 | MySQL compatible warehouse columnar engine that is multi-terabyte capable. | Yes | Share-Nothing Scaleout | Yes | Community Edition (single node limit)
Greenplum | GoPivotal (wp), 2003 | Hybrid column/row-oriented database based on PostgreSQL with many enhancements to allow efficient parallel execution over multiple machines. | Medium | Shared Nothing MPP Architecture | Yes. Append only tables. Supports zlib, quickLZ and Run Length Encoding. | Trial Version
Teradata Database | Teradata (wp), 1979 | One of the longest established and largest suppliers of column-oriented databases, with a full supporting stack of associated software. Continues to innovate and recently purchased Kickfire, a column-oriented database that used FPGAs to accelerate SQL queries. | Medium | Share Nothing | Yes. Automatically chooses from among six types of compression (run length, dictionary, trim, delta on mean, null and UTF8) based on the column demographics. | Express Edition. Size limits vary by platform.
Vectorwise/Paraccel | Actian (wp), 2008 | Modern "Database architected for the new bottleneck: Memory Access." Based on research around the open source MonetDB and the X100 project, including efficient memory handling and vectorized query execution (SIMD). Consistently scores highly in the TPC-H benchmarks. | Hybrid | - | Yes. Dictionary for strings, proprietary speedy compression of numeric data. | 30 day Trial, requires signup
Sybase IQ | SAP (wp), 1994 | Mature column-oriented database by one of the first commercial vendors, with many deployments (2000+) and good tooling support. It may be showing its age, as I've heard reports it can struggle to handle very large amounts of data or be slower than newer entrants; however, this is hearsay and the Sybase version history shows a good track record of feature updates. More details are available here. | High | Shared-Disk Architecture | Yes. Token/Dictionary. | Express Edition, 5GB Limit
Vertica | HP (wp), 2005 | A modern parallel column-oriented database designed to run on multiple commodity servers. Co-founded by database researcher Michael Stonebraker, based on previous open-source / academic work on C-Store. More on the Vertica architecture can be found here. | Yes | Shared-Nothing | Yes. LZO, Run Length Encoding, Delta. | Community Edition, 3 Node / 1 TB / Feature Limits.

* Not all column-oriented databases can be considered equal. There are in fact differing levels of how column-oriented a database is, depending on how
  • it stores data
  • the query planner operates
  • results are materialised
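
Several of the compression schemes named in the table above (run length encoding, dictionary and delta) are simple enough to sketch in a few lines. The following Python is a minimal illustration of the general techniques only, under the assumption of small in-memory lists; it is not any vendor's actual implementation, and the function names are made up for this example.

```python
from itertools import groupby


def run_length_encode(values):
    # Collapse consecutive repeats into (value, run_length) pairs.
    return [(value, len(list(run))) for value, run in groupby(values)]


def dictionary_encode(values):
    # Replace each distinct value with a small integer code.
    codes = {}
    encoded = [codes.setdefault(value, len(codes)) for value in values]
    # The list of keys is the dictionary: position in the list == code.
    return list(codes), encoded


def delta_encode(values):
    # Keep the first value, then store each difference from the previous value.
    return values[:1] + [b - a for a, b in zip(values, values[1:])]


print(run_length_encode(["NY", "NY", "NY", "LDN", "LDN", "NY"]))
print(dictionary_encode(["NY", "LDN", "NY", "TKY"]))
print(delta_encode([1000, 1001, 1003, 1006]))
```

These schemes work well on columnar data because values within one column tend to be repetitive or slowly changing, which is much less true of values laid out row by row.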

The major benchmark for analytical queries amongst these vendors is the TPC-H decision support database benchmark; you can download the benchmark and view past results. Vendors not listed that may be added to the table later include: Exasol, MS SQL Server ColumnStore, Infobright, IBM DB2.