How ChatGPT generates SQL Queries

  1. ChatGPT is an artificial intelligence chatbot developed by OpenAI and released in November 2022.
  2. It has been trained on large internet data sets including stackoverflow and can answer technical queries.
  3. Within Pulse we have added a call out to the OpenAI API to allow users to request queries.
  4. ChatGPT allows specifying multiple paragraphs of context. e.g. "You are an SQL expert and will answer all requests from users with a select statement".
  5. Pulse knows your database schema and sends database type plus table and column names as part of the context.
  6. This allows ChatGPT to guess more accurately what query you are trying to form

Example ChatGPT / SQL Queries

In the video I requested the following queries:

As you can see the only really hard question is the last query and I slightly nudged it to give the correct answer by telling it what function to use.

Thoughts for Future

  1. The accuracy of SQL generation is massively better than for kdb+. ChatGPT regularly forgets to add data clauses or attempts to use AND instead of ,.
  2. ChatGPT hallucinates answers based on common schemas it has seen somewhere. e.g. It will yse trade/quote table and size/sym column even when our data set doesn't have those tables/columns. Feeding it our own schema helps avoid this 50% of the time.
  3. With Pulse/qStudio we have to make it work with whatever schema the end user may have. If instead we were deploying LLMs against just one standardised data set I think it could do a lot lot better. Imagine we trained it with 500 previously common user queries on this specific data set it would have a lot more context and less ability to go wrong. Additionally it would be easier to check that it was generating valid queries.

Right now ChatGPT is only useful to generate a starting point for a standard SQL query. Once the ability to train for specific data sets becomes widely available I think something like ChatGPT will become extremely valuable and widely used by the majority of people.

Thanks for watching our demo. Please download Pulse to give it a try for yourself.