The Top Column-Oriented Databases Compared


This is a list of the top commercial, financial and open source column-oriented / tick databases available.

Businesses are realizing that a one-size-fits-all approach isn't working for databases. With the increasing acceptance and widespread adoption of alternative data storage systems such as NoSQL, column-oriented databases now receive more attention, and a number of major vendors have started to provide columnar storage as a value-add to their existing databases.

We provide an overview of each below; future articles will dive into each in more detail:

Open Source Column-Oriented Databases

There are three open source column databases available, all based on the work of research groups that later saw commercial spinoffs: C-Store produced Vertica, MonetDB spawned Vectorwise, and LucidDB was the basis of DynamoBI. Each project has stagnated as the team around it moved on to commercial endeavours. Only MonetDB appears to be still actively developed; it is also the one that seems most feature complete, as we'll see later.

C-Store
  • Vendor (release year): University collaboration between Brown, Brandeis, the University of Massachusetts and MIT (2006)
  • Description: Open source column-oriented database, optimized for reads, produced as a joint research project. Mike Stonebraker from MIT went on to form Vertica, listed under the commercial vendors. A number of interesting research papers are available: overview, compression.
  • Compression: Yes. RLE, null suppression, gzip, dictionary of strings.
  • Download: BSD source available (difficult to compile as it relies on old gcc/BerkeleyDB).

MonetDB
  • Vendor (release year): Research centre based in the Netherlands (1993)
  • Description: An early pioneering column data store whose technology has been imitated by others and led directly to the Actian/Vectorwise commercial product. An extremely fast column-oriented database that can handle large amounts of data; however, its basis as a research project shows through in some frustrating aspects (issues in areas of little research value can remain outstanding for months).
  • Compression: Some. Dictionary of strings.
  • Download: Installers available for most platforms. Mozilla Public License 1.1.

LucidDB
  • Vendor (release year): Was a research project (2007)
  • Description: An open source project that DynamoBI attempted to commercialise, but it never really took off. Part Java, part C++. Only limited connectivity options are available, but the architecture is clearly documented and looks good.
  • Compression: Yes.
  • Download: Available for most platforms. Apache License.
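
Two of the compression schemes named above, run-length encoding (RLE) and a dictionary of strings, are simple enough to sketch in a few lines. The Python below is only an illustration of the general techniques, not the implementation used by C-Store, MonetDB or LucidDB; the function names and the sample `exchange` column are made up for the example.

```python
from itertools import groupby


def rle_encode(values):
    """Run-length encode a column as a list of (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in pairs for _ in range(count)]


def dictionary_encode(values):
    """Replace each string with a small integer code plus a lookup table."""
    dictionary = []       # code -> string
    codes_by_value = {}   # string -> code
    codes = []
    for value in values:
        if value not in codes_by_value:
            codes_by_value[value] = len(dictionary)
            dictionary.append(value)
        codes.append(codes_by_value[value])
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Map integer codes back to their strings."""
    return [dictionary[code] for code in codes]


# Hypothetical sorted column of exchange names (illustrative data only).
exchange = ["LSE", "LSE", "LSE", "NYSE", "NYSE", "TSE"]

print(rle_encode(exchange))         # [('LSE', 3), ('NYSE', 2), ('TSE', 1)]
print(dictionary_encode(exchange))  # (['LSE', 'NYSE', 'TSE'], [0, 0, 0, 1, 1, 2])
```

Both schemes benefit from the columnar layout: values of one column sit next to each other, so long runs and repeated strings are far more common than they would be in a row-by-row layout.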

Financial Tick Databases

* Column-oriented: not all column-oriented databases can be considered equal. There are differing levels of how column-oriented a database is, depending on how:
  • it stores data
  • the query planner operates
  • results are materialised

kdb+
  • Vendor (release year): KX (1998)
  • Description: An early column-oriented database that has proven itself fast and capable of holding massive amounts of data, widely used in the finance industry. Provides its own vector-based language, q, and offers a variant of SQL specialised for order/time-series based queries. It has a unique conciseness and consistency compared to other more monolithic databases, as it was mostly created by one man, Arthur Whitney.
  • Column-oriented*: High
  • Compression: Yes. OS-level compression is recommended (e.g. Solaris/ZFS); gzip/zlib and a proprietary compression format are also possible.
  • Download: Trial version (32-bit, memory limited).

One Tick Database
  • Vendor (release year): Onetick (2005)
  • Description: Column/row oriented database targeted at the financial sector and specialised for tick data, created by Leonid Frants, who had built a tick solution while at Goldman Sachs.
  • Column-oriented*: -
  • Compression: Yes. Gzip plus proprietary compression format.
  • Download: Contact them for a trial.

eXtremeDB
  • Vendor (release year): McObject (2001)
  • Description: A fast embedded, mostly in-memory database targeted at financial firms and time series data. Its raw API and ability to be embedded within a process make it fast; however, this means a higher configuration cost and learning curve to get started.
  • Column-oriented*: Hybrid
  • Compression: Yes. Gzip plus proprietary compression format.
  • Download: Evaluation version (requires registration).
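
The first point in the column-oriented note, how a database stores data, can be made concrete with a toy example. The sketch below is only an illustration in plain Python, assuming every row has the same fields; the trade records and function names are hypothetical and do not reflect how any product listed here lays out data on disk.

```python
# Hypothetical trade records in a row-oriented layout: one dict per row.
rows = [
    {"sym": "A", "price": 10.0, "size": 100},
    {"sym": "B", "price": 20.0, "size": 200},
    {"sym": "A", "price": 11.0, "size": 300},
]


def to_columns(rows):
    """Pivot row dicts into a column-oriented layout: one list per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def average_price_rows(rows):
    """Row layout: every whole row is visited just to read one field."""
    return sum(row["price"] for row in rows) / len(rows)


def average_price_columns(columns):
    """Column layout: only the contiguous 'price' column is read."""
    prices = columns["price"]
    return sum(prices) / len(prices)


columns = to_columns(rows)
print(columns["sym"])    # ['A', 'B', 'A']
print(columns["price"])  # [10.0, 20.0, 11.0]
```

Both functions return the same average, but the column version never touches `sym` or `size`. On disk that difference translates into reading far fewer bytes for analytical queries that aggregate a handful of columns over many rows.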

Commercial Column-Oriented Database Vendors

* Column-oriented: see the note on levels of column orientation under Financial Tick Databases.

InfiniDB
  • Vendor (release year): Calpont (2000)
  • Description: MySQL-compatible columnar warehouse engine that is multi-terabyte capable.
  • Column-oriented*: Yes
  • Grid framework: Share-nothing scale-out
  • Compression: Yes.
  • Download: Community Edition (single node limit).

Greenplum
  • Vendor (release year): GoPivotal (2003)
  • Description: Hybrid column/row oriented database based on PostgreSQL, with many enhancements to allow efficient parallel execution over multiple machines.
  • Column-oriented*: Medium
  • Grid framework: Shared-nothing MPP architecture
  • Compression: Yes. Append-only tables. Supports zlib, quickLZ and run length encoding.
  • Download: Trial version.

Teradata Database
  • Vendor (release year): Teradata (1979)
  • Description: One of the longest established and largest suppliers of column-oriented databases, with a full supporting stack of associated software. Continues to innovate and recently purchased Kickfire, a column-oriented database that used FPGAs to accelerate SQL queries.
  • Column-oriented*: Medium
  • Grid framework: Share-nothing
  • Compression: Yes. Automatically chooses from among six types of compression (run length, dictionary, trim, delta on mean, null and UTF8) based on the column demographics.
  • Download: Express Edition (size limits vary by platform).

Vectorwise/Paraccel
  • Vendor (release year): Actian (2008)
  • Description: Modern "Database architected for the new bottleneck: Memory Access." Based on research around the open source MonetDB and the X100 project, including efficient memory handling and vectorized query execution (SIMD). Consistently scores highly in the TPC-H benchmarks.
  • Column-oriented*: Hybrid
  • Grid framework: -
  • Compression: Yes. Dictionary for strings, proprietary speedy compression of numeric data.
  • Download: 30-day trial (requires signup).

Sybase IQ
  • Vendor (release year): SAP (1994)
  • Description: Mature column-oriented database by one of the first commercial vendors, with many deployments (2000+) and good tooling support. It may be showing its age, as I've heard reports it can struggle to handle very large amounts of data or be slower than newer entrants; however, this is hearsay, and the Sybase version history shows a good track record of feature updates. More details are available here.
  • Column-oriented*: High
  • Grid framework: Shared-disk architecture
  • Compression: Yes. Token/dictionary.
  • Download: Express Edition (5GB limit).

Vertica
  • Vendor (release year): HP (2005)
  • Description: A modern parallel column-oriented database designed to run on multiple commodity servers. Co-founded by database researcher Michael Stonebraker, based on previous open-source / academic work on C-Store. More on the Vertica architecture can be found here.
  • Column-oriented*: Yes
  • Grid framework: Shared-nothing
  • Compression: Yes. LZO, run length encoding, delta.
  • Download: Community Edition (3 node / 1 TB / feature limits).
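
Delta encoding appears in several of the compression entries above. As a rough sketch of the general idea only (this is plain consecutive-difference encoding, not Vertica's format or Teradata's "delta on mean" variant, and the function names and timestamps are hypothetical), a sorted numeric column can be stored as its first value followed by small differences:

```python
from itertools import accumulate


def delta_encode(values):
    """Store the first value, then the difference to each previous value."""
    if not values:
        return []
    deltas = [current - previous for previous, current in zip(values, values[1:])]
    return [values[0]] + deltas


def delta_decode(encoded):
    """Rebuild the original column with a running sum over the deltas."""
    return list(accumulate(encoded))


# Hypothetical increasing timestamps in milliseconds (illustrative data only).
timestamps = [1000, 1001, 1003, 1003, 1010]

print(delta_encode(timestamps))  # [1000, 1, 2, 0, 7]
```

The deltas are much smaller than the raw values, so they fit in fewer bits and compress well with a general-purpose scheme such as zlib or LZO applied afterwards.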

The major benchmark for analytical queries amongst these vendors is the TPC-H decision support database benchmark; you can download the benchmark and view past results. Vendors not listed that may be added to the table later include: Exasol, MS SQL Server ColumnStore, Infobright, IBM DB2.