Commit

modified format slightly
kyleoconnell-NIH committed Mar 5, 2024
1 parent f2d1094 commit 6fdd900
Showing 1 changed file with 16 additions and 17 deletions.
33 changes: 16 additions & 17 deletions tutorials/notebooks/GenAI/notebooks/AzureAIStudio_sql_chatbot.ipynb
@@ -8,12 +8,24 @@
"# Creating a Chatbot for Structured Data using SQL"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Overview\n",
"**Generative AI (GenAI)** is a groundbreaking technology that generates human-like text, images, code, and other forms of content. However, many GenAI techniques and implementations focus on unstructured data such as PDFs, text documents, image files, and websites, where you must set a parameter called *top K*. Top K uses a scoring algorithm to retrieve only the highest-scoring pieces of content or documents relevant to the user's request. This limits the amount of data the model is presented with, which puts users at a disadvantage when they want to gather information from structured data such as CSV and JSON files, because they typically want every occurrence of the relevant data. \n",
"\n",
"For example, suppose you have a table that lists different types of apples, where they originate, and their colors, and you want a list of red apples that originate from the US. The model would give you only part of the data you need, because it retrieves only the top-scoring results, which might be the top 4 or 20 apple names (depending on how you have configured your model), instead of listing them all. \n",
"\n",
"The technique laid out in this tutorial uses **SQL databases** and asks the model to create a query based on the user's request. It then submits that query to the database and presents the user with the results. This not only gives us all the information we need but also decreases the chances of hitting our token limit."
]
},
{
"cell_type": "markdown",
"id": "431e4421-0b41-4a12-9811-0d7a030cf0f9",
"metadata": {},
"source": [
"### Objectives"
"## Learning Objectives"
]
},
{
@@ -32,7 +44,7 @@
"id": "3d2aa60a-cf87-4083-80fa-e9dc9179dcc8",
"metadata": {},
"source": [
"### Table of Contents"
"## Table of Contents"
]
},
{
@@ -51,22 +63,9 @@
},
{
"cell_type": "markdown",
"id": "83ce92a3-dff9-4a30-8f65-4c5b75349119",
"metadata": {},
"source": [
"### Summary <a id=\"summary\"></a>"
]
},
{
"cell_type": "markdown",
"id": "84a3fa39-a341-4a0c-a17d-a8c657759117",
"metadata": {},
"source": [
"**Generative AI (GenAI)** is a groundbreaking technology that generates human-like text, images, code, and other forms of content. However, many GenAI techniques and implementations focus on unstructured data such as PDFs, text documents, image files, and websites, where you must set a parameter called *top K*. Top K uses a scoring algorithm to retrieve only the highest-scoring pieces of content or documents relevant to the user's request. This limits the amount of data the model is presented with, which puts users at a disadvantage when they want to gather information from structured data such as CSV and JSON files, because they typically want every occurrence of the relevant data. \n",
"\n",
"For example, suppose you have a table that lists different types of apples, where they originate, and their colors, and you want a list of red apples that originate from the US. The model would give you only part of the data you need, because it retrieves only the top-scoring results, which might be the top 4 or 20 apple names (depending on how you have configured your model), instead of listing them all. \n",
"\n",
"The technique laid out in this tutorial uses **SQL databases** and asks the model to create a query based on the user's request. It then submits that query to the database and presents the user with the results. This not only gives us all the information we need but also decreases the chances of hitting our token limit."
"## Get Started"
]
},
{
@@ -505,7 +504,7 @@
"id": "116c547b-c569-4843-a6a9-e81c6c0f8252",
"metadata": {},
"source": [
"### Cleaning Up Resources <a id=\"cleanup\"></a>"
"## Clean Up <a id=\"cleanup\"></a>"
]
},
{
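
The Overview text in this commit contrasts top-K retrieval with running a model-generated SQL query. The sketch below illustrates that contrast under stated assumptions and is not taken from the notebook: the apple rows are made-up sample data, `generate_sql` is a hypothetical stand-in that returns a fixed query where the tutorial would call a model, and an in-memory SQLite database stands in for whatever SQL database the notebook actually uses.

```python
import sqlite3

# Illustrative sample data only (name, origin, color); not from the notebook.
APPLES = [
    ("Red Delicious", "US", "red"),
    ("Jonathan", "US", "red"),
    ("Rome", "US", "red"),
    ("Empire", "US", "red"),
    ("Winesap", "US", "red"),
    ("Golden Delicious", "US", "yellow"),
    ("Fuji", "Japan", "red"),
    ("Granny Smith", "Australia", "green"),
]


def generate_sql(question: str) -> str:
    """Hypothetical stand-in for the model call: returns a fixed query."""
    return (
        "SELECT name FROM apples "
        "WHERE color = 'red' AND origin = 'US' ORDER BY name"
    )


def top_k_names(rows, k):
    """Mimic top-K retrieval: keep at most k of the matching rows."""
    matches = [name for name, origin, color in rows
               if color == "red" and origin == "US"]
    return matches[:k]


def run_query(sql):
    """Load the sample rows into an in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE apples (name TEXT, origin TEXT, color TEXT)")
        conn.executemany("INSERT INTO apples VALUES (?, ?, ?)", APPLES)
        return [row[0] for row in conn.execute(sql)]
    finally:
        conn.close()


if __name__ == "__main__":
    question = "Which red apples originate from the US?"
    print("top-K (k=4):", top_k_names(APPLES, 4))
    print("SQL result: ", run_query(generate_sql(question)))
```

With `k=4` the top-K helper returns only four of the five matching apples, while the SQL query returns all of them. A real pipeline would also need to validate the generated query before executing it, which this sketch leaves out.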