kdb q Quirks
The below is a collection of issues that are commonly asked or are just plain odd
Contents
- Deleting a Namespace - you can't
- Spaces in symbols - cast from string
- Standard SQL issues
- Column Naming Errors - Reserved words and spaces
- SQL CASE WHEN statement - Vector conditional as substitute
- SUBSTR or concatenating string columns
- select * from t where OR AND COMMA - kdb prefers comma to AND
Deleting a Namespace
We cannot delete a namespace once it has been created, we can however delete variables from namespaces as shown below:
Spaces in symbols
Spaces between symbols will cause an interpreter error. If you want spaces or other special characters you should create an array of strings then cast to symbols. e.g.
Standard SQL issues
Column Naming Errors
In order to select from a table with problematic names involving spaces or reserved words, we have to use functional select. Functional select allows us to supply the column names as symbols. Which as we seen above we can cast from strings to handle any special characters.
SQL CASE WHEN statement
q-SQL does not have CASE-WHEN like standard SQL, it does however have a number of other language features that provide similar functionality in a more elegant way. The most simple cases can be replaced with dictionary lookup:
More complicated cases e.g. conditionally selecting which source column to pull data from depending on a clause will require the vector conditional. Vector conditional returns a result that depending on its first argument (a list of booleans) takes either values from its second parameter if true, otherwise its third parameter. Below we can see this used to return the latest date from two columns:
SUBSTR or concatenating string columns
For joining a constant string to a string column use
join-each-left ,\:"postfix"
or join-each-right "prefix",/:
For joining two string columns to each other, use join-both ,'
e.g.
The parentheses are needed to prevent the interpreter getting confused and thinking your specifying different columns.
select * from t where OR AND COMMA
Kdb uses comma to separate where clauses. AND is also a function in kdb but it will give significantly worse performance. AND causes each of the columns contained in the clauses to be fully scanned. For the standard comma's, kdb will evaluate the first clause, store what indices evaluated true and only check those column indices for the next conditional and so on. This is why kdb queries with the comma perform much faster.