In this tutorial we will explain what symbols are and how we can enumerate symbols to enforce values to conform. By fully understanding what symbols are you will better be able to understand how to structure your database and queries optimally.

Symbols

Storing the same string e.g. "New York Stock Exchange" repeatedly in memory would require a significant amount of memory. Many languages including q intern strings so that only one copy of any distinct value is stored. Any other references to the same string are stored as just that, references. Symbols in kdb are interned strings where references point to a unique table of strings, as shown in the diagram:

What enumerated symbols are. An integer pointer to a constant string pool

This means symbols

  • Increase the time needed for creation as the position of the distinct string must be found
  • Reduce the space needed when the same values occur multiple places in the database
  • Significantly speed up equality comparisons
    where col1 in `a`b
    where col2=`tick

When to use Symbol/String

The decision is further complicated because kdb typically stores the string pool entirely in memory (that's what your sym file is in a typical kdb+tick database). You must be careful not to cause it to become too bloated as it is difficult to correct. Typically you should follow the guidelines:

  1. If the column is used in where clause equality comparisons e.g. select from t where sym in `A`B -> Symbol
  2. Short, often repeated strings -> Symbol
  3. Else Long, Non-repeated strings -> String

Enumeration

Similar to casting we can convert a list of values to a user definied enumeration to limit the values permitted. This is similar to enumerated types in C++, java and C#. Here is an example where we create a domain called suits, we then enumerate the list l against the domain suits in order to create the enumerated list el. Any appends or amends to el are then restricted to the enumerated type domain.

From the code you can see that trying to add a value not in the restricted domain fails. If we wanted to add a new suit called royals to our list we would have to first add it to our domain like so:

Foreign Keys

The ability to restrict the values of a list or column to a domain can be easily extended to restrict one column to link to another table and this is how kdb implements foreign keys.

On our 3 day kdb training course we look at this in much more detail and go through worked examples. We recommend it as the quickest way to learn kdb.