Update create_index_from_csv.md
kyleoconnell-NIH authored Feb 8, 2024
1 parent 3a8e133 commit 77b3b6c
Showing 1 changed file with 36 additions and 0 deletions.
36 changes: 36 additions & 0 deletions docs/create_index_from_csv.md
@@ -39,6 +39,42 @@ Click `Import data`

![Import Data](/docs/images/6_import_data.png)

Now fill out all the necessary parameters.
+ Data Source: Select `Azure Blob Storage`. New options will drop down.
+ Data source name: This can be anything, but go with something like `grant-data`.
+ Data to extract: Select `Content and metadata`.
+ Parsing mode: Select `Delimited text`. Check the `First Line Contains Header` box and leave `Delimiter Character` as `,`.
+ Connection string: Click `Choose an existing connection` and navigate to your storage account and container.
+ Managed identity authentication: Leave as default.
+ Container name: This should be populated once you connect via the connection string; otherwise, enter your container name here.
+ Blob folder: *Optional*. If the file(s) you want to index are in a folder within the container, enter that path here.
+ Description: *Optional*.
+ If you get errors when trying to go to the next screen, make sure your CSV has no trailing commas and that the header names contain no spaces. If either problem is present, fix it, re-upload the file to blob storage, and try again.
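
The two CSV problems mentioned in the last bullet can be caught locally before uploading. The following is a minimal sketch, not part of the original workflow: the function name `find_csv_problems` is hypothetical, and it assumes a simple comma-delimited file with no quoted fields that span multiple lines.

```python
import csv
import io


def find_csv_problems(csv_text):
    """Return a list of readable problems found in comma-delimited text.

    Assumption: fields do not contain embedded newlines, so each raw
    line of text corresponds to one CSV record.
    """
    lines = csv_text.splitlines()
    if not lines:
        return ["file is empty"]

    problems = []

    # Check 1: header names must not contain spaces.
    header = next(csv.reader(io.StringIO(csv_text)))
    for name in header:
        if " " in name:
            problems.append(f"header name contains a space: {name!r}")

    # Check 2: no line (header or data) may end with a trailing comma.
    for number, line in enumerate(lines, start=1):
        if line.rstrip().endswith(","):
            problems.append(f"line {number} ends with a trailing comma")

    return problems
```

To check a file on disk, you could read it with `open(path, newline="").read()` and pass the text to the function; an empty list means neither problem was found.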

![Connect to blob](/docs/images/7_connect_to_blob.png)

Skip ahead to `Customize target index`.
+ Give your index a name.
+ Make `Project_Number` your key.
+ Make sure the expected column names are present under fields. For the columns you expect to use, select `Retrievable` and `Searchable`. Avoid selecting every column, since you would then pay to index fields you never use.
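
For readers who prefer to see the field choices as data, here is a rough sketch of what the selections above amount to. It builds plain dictionaries using the `key`, `retrievable`, and `searchable` attribute names from the Azure AI Search index schema. The helper `build_index_fields` is hypothetical, every column is assumed to be a string (`Edm.String`), and all column names other than `Project_Number` are made up for illustration. The portal steps above remain the supported path in this guide.

```python
def build_index_fields(columns, key_column, used_columns):
    """Describe index fields as a list of dictionaries.

    columns:      all header names from the CSV
    key_column:   the column chosen as the document key
    used_columns: columns you actually expect to query or display
    """
    if key_column not in columns:
        raise ValueError(f"key column {key_column!r} is not in the CSV header")

    fields = []
    for name in columns:
        is_key = name == key_column
        # Only the key and the columns in use are marked, mirroring the
        # advice to avoid paying to index unused columns.
        selected = is_key or name in used_columns
        fields.append(
            {
                "name": name,
                "type": "Edm.String",  # assumption: treat every column as text
                "key": is_key,
                "retrievable": selected,
                "searchable": selected,
            }
        )
    return fields


# Hypothetical header; only Project_Number comes from this guide.
fields = build_index_fields(
    columns=["Project_Number", "Project_Title", "Abstract", "Internal_Notes"],
    key_column="Project_Number",
    used_columns={"Project_Title", "Abstract"},
)
```

In this example `Internal_Notes` ends up neither retrievable nor searchable, which corresponds to leaving both boxes unchecked for that column in the portal.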

![Customize index](/docs/images/8_target_index.png)

Advance to `Create an indexer`, name your indexer, then click `Submit`.

![Create indexer](/docs/images/9_create_indexer.png)

Navigate to `Indexes` on the left panel and wait until your index shows as many documents as you have lines in your file. The index will read 0 documents until indexing has finished. The example 500-line CSV takes about one minute.
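
If you would rather wait in a script than refresh the portal, the waiting step can be sketched as a small polling loop. This is an illustration under assumptions: `wait_for_document_count` is a hypothetical helper, and `get_count` is any zero-argument callable you supply that returns the current document count. If you use the `azure-search-documents` package, `SearchClient.get_document_count` might serve that role, but setting up that client is outside the scope of this guide.

```python
import time


def wait_for_document_count(get_count, expected, timeout_s=300.0, poll_s=5.0,
                            sleep=time.sleep):
    """Poll get_count() until it returns at least `expected`.

    Returns the final count, or raises TimeoutError once roughly
    `timeout_s` seconds of sleeping have elapsed without reaching it.
    The `sleep` parameter exists so the loop can be exercised without
    real delays.
    """
    waited = 0.0
    while True:
        count = get_count()
        if count >= expected:
            return count
        if waited >= timeout_s:
            raise TimeoutError(
                f"index shows {count} documents after {waited:.0f}s; "
                f"expected {expected}"
            )
        sleep(poll_s)
        waited += poll_s
```

The count is checked before each sleep, so an index that has already finished returns immediately.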

![Check index](/docs/images/10_check_index.png)


And that is it! Now return to [the tutorial notebook](/tutorials/notebooks/GenAI/LLM_query_csv.ipynb) to run queries against this CSV using GPT-4.







