Commit

Update create_index_from_csv.md
kyleoconnell-NIH authored Feb 8, 2024
1 parent f3bcf22 commit 8b5f478
Showing 1 changed file with 15 additions and 2 deletions.
17 changes: 15 additions & 2 deletions docs/create_index_from_csv.md
@@ -2,12 +2,25 @@
:sparkles: Here we outline how to create an Azure search index from a CSV file summarizing funded award data exported from Reporter.nih.gov

### 1) Generate input CSV
:ear: If you already have your csv ready, skip to (2).
:ear: If you already have your csv ready, skip to section (2).

Our input data comes from the csv export option for [Reporter.nih.gov](https://reporter.nih.gov/). Navigate to reporter.nih.gov and select `Advanced Search`. Input your search parameters. In this case we filtered for awards made by NIGMS in FY 23. In the top right, select `Export`.

Select your export columns and make sure you export as a csv. In the example input data file we only selected 'Title', 'Project_ID', and 'Total_Cost', although a few other columns were also exported.

![Export from Reporter](/docs/images/1_export_from_reporter.png)
![Export from Reporter](/docs/images/1_export_reporter_csv.png)

If you upload through the UI, make two small edits to the exported csv. First, remove the extra comma at the end of each line. Second, replace the spaces in the column names in the header row, for example with underscores, as in `Total_Cost`. You can do this with a short Python script or with a find/replace in a text editor.
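The two edits described above could be scripted roughly as follows. This is a minimal sketch, not the project's own tooling: the function names `clean_reporter_csv` and `clean_file` are made up for illustration, the choice of underscores as the replacement character is an assumption, and the sketch assumes the trailing comma appears on every line, including the header.

```python
import csv
import io


def clean_reporter_csv(text: str) -> str:
    """Return CSV text without the trailing empty column and with
    spaces in the header names replaced by underscores (assumed convention)."""
    # Parse with the csv module so commas inside quoted fields are preserved.
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if not rows:
        return ""

    # A trailing comma on the header line shows up as an empty last field.
    # Only in that case, drop the extra column from every row.
    if rows[0][-1] == "":
        width = len(rows[0]) - 1
        rows = [row[:width] for row in rows]

    # Replace spaces in the column names of the header row.
    rows[0] = [name.strip().replace(" ", "_") for name in rows[0]]

    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


def clean_file(src_path: str, dst_path: str) -> None:
    """Read src_path, clean it, and write the result to dst_path."""
    # "utf-8-sig" drops a byte-order mark if the export happens to include one.
    with open(src_path, newline="", encoding="utf-8-sig") as src:
        cleaned = clean_reporter_csv(src.read())
    with open(dst_path, "w", newline="", encoding="utf-8") as dst:
        dst.write(cleaned)
```

To use it, call `clean_file` with the path of your exported file and the path where the cleaned copy should go. Parsing with the `csv` module rather than stripping the last character of each line keeps commas inside quoted award titles intact.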

### 2) Import data into Azure blob storage
:ear: If you already added your data to blob storage, skip to section (3).

On the Azure portal home page, navigate to `Storage Accounts`.

![Nav Storage Account](/docs/images/2_storage_accounts.png)





