<aside>
💡 These guidelines apply specifically to ‘chat-with-your-data’ mode in Copilot. They are based on our experience with current LLMs such as GPT-4o and GPT-3.5, among others. Please share your experience and help us make them better.
</aside>
Important concepts
- The simple guiding principle: provide enough context to improve the accuracy of the answers. Unlike a Google search, you can include several sentences of context in the user prompt.
- Try to have a conversation: When the AI engine picks the wrong column or interprets a value incorrectly, nudge it with a small correction. Consider also updating the table description in the table comment (SQL) so the correction is captured for others using the DB.
- When the selected DB/schema contains many tables, select the ones you think are most relevant. This lets the engine supply the appropriate context, which increases the accuracy of the generated SQL. This is especially important when several tables have similar names or semantically overlap.
- Relationships: Ideally, your tables already have foreign key constraints that capture the relationships. If not, specify them in your table description (described further below). Include the commonly used relationships in the fact table descriptions.
- Exploit the foundation model's prior knowledge: For instance, if you have a column called 'State' with values ('CA', 'NY', ...) or airport codes like ('SFO', 'LAX', ...), there is a good chance the LLM already knows what they mean. Nothing is required from you.
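As a minimal sketch of the relationships point above, the SQL below shows both ways of capturing a relationship. It assumes a MySQL/MariaDB-style dialect, and the tables `orders` (fact) and `customers`, the columns, and the constraint name are all hypothetical.

```sql
-- Hypothetical schema: orders.customer_id refers to customers.id.

-- Option 1: declare the relationship as a foreign key constraint.
ALTER TABLE orders
  ADD CONSTRAINT fk_orders_customer
  FOREIGN KEY (customer_id) REFERENCES customers (id);

-- Option 2: if a constraint is not practical, state the relationship
-- in the fact table's comment so it travels with the schema.
ALTER TABLE orders
  COMMENT = 'One row per customer order. orders.customer_id joins to customers.id.';
```

Option 2 is the fallback for schemas where constraints cannot be added, for example because of load performance or legacy data.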
Increase accuracy by following the steps below
Step 1) Describe your Table and Column names
- The most important question is whether your schema's naming conventions match how someone would refer to those entities in natural language. If they do, your work is significantly reduced.
- If the table and column names do not reflect the common terms that users use in natural language, you should include synonyms in your table description.
- For example, we use the term 'Service' to denote a customer-launched DB or DB cluster. We therefore add the following synonyms using SQL:
Alter table Service comment "*Also known as Database or DB cluster*..... "
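The same idea may extend to column names. The sketch below assumes a MySQL/MariaDB-style dialect and assumes that column comments are picked up in the same way as table comments, which this guide does not confirm. The column `region`, its type, and the comment text are hypothetical.

```sql
-- Hypothetical column: Service.region, stored as VARCHAR(64) NOT NULL.
-- MODIFY COLUMN replaces the whole column definition, so restate the
-- existing type and constraints and only add the COMMENT clause.
ALTER TABLE Service
  MODIFY COLUMN region VARCHAR(64) NOT NULL
  COMMENT 'Also known as location or cloud region.';
```

Check the existing definition first (for example with `SHOW CREATE TABLE Service;`) so that no attribute is dropped by accident.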
One simple way to get started on a description for your table is to use Copilot.
For instance, you could ask:
Select a few rows from <yourTableName> and develop an understanding of the table. Then output: (1) other common names for this table (2) a succinct 2-3 sentence description of this table.
Then tweak this description and add other important semantics, such as the table's relationships.
Finally, add this description as the table comment. You have to use a SQL client for this, because Copilot is not authorized to make any changes to your DB.
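A minimal sketch of that last step, run from a SQL client, might look like the following. It assumes a MySQL/MariaDB-style dialect, and the description text is a hypothetical placeholder for the one you refined.

```sql
-- Hypothetical description text; replace it with your refined description.
ALTER TABLE Service
  COMMENT = 'Also known as Database or DB cluster. One row per customer-launched service.';

-- Verify what is now stored (MySQL/MariaDB information_schema).
SELECT TABLE_NAME, TABLE_COMMENT
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'Service';
```

Setting the comment replaces any existing comment rather than appending to it, so put the synonyms, description, and relationships in a single string.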