created preparedata.py and edited README
Pfeiffer committed Jul 15, 2024
1 parent d7e6283 commit 5b01430
Showing 2 changed files with 223 additions and 23 deletions.
40 changes: 17 additions & 23 deletions viz/README.md
Note that the `OPFLOW` application is available in the `$EXAGO_INSTALL/bin` directory.

The above command runs `OPFLOW` on the given network and generates an output file called `opflowout.json`. The `-gicfile` option can be used to supply a file containing the geospatial coordinates (latitude/longitude) for the network. If geospatial coordinates are not provided, OPFLOW draws the network as a circle, so it is highly recommended to provide the geospatial coordinate file as input to display the network correctly on the map. The geospatial coordinate file should have the same format as used for the [Electric Grid Test Case Repository](https://electricgrids.engr.tamu.edu/) synthetic networks.

Copy over the `opflowout.json` file to the `viz/data` subdirectory. Then, run the Python script `preparedata.py` to load the JSON file into the visualization:
```
python preparedata.py opflowout.json
```

The script converts the JSON file into `generation.csv`, `bus.csv`, and `transmission_line.csv` for database queries, and generates the files needed for visualization.
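The conversion itself is straightforward: each GeoJSON feature's properties become one row of a CSV table. A minimal sketch of the idea, using a made-up two-substation example rather than real ExaGO output:

```
import csv
import io

# Hypothetical stand-in for the per-feature properties extracted from the
# ExaGO GeoJSON output; real rows carry more fields (coordinates, KV levels, ...).
rows = [
    {"bus_name": "Sub A", "number of buses": 2, "vm": 1.01},
    {"bus_name": "Sub B", "number of buses": 1, "vm": 0.99},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=rows[0].keys())
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```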

Note: If you have created the JSON file externally, simply copy it into the `viz/data` subdirectory and run the `preparedata.py` script using the above command.

Behind the scenes, the LLM translates natural language queries into SQL queries to retrieve data from a database. Since the power grid network is a typical geospatial dataset, we choose a PostgreSQL + PostGIS database for the convenience of conducting spatial queries.
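For illustration only, here is the kind of spatial SQL such a pipeline might emit. The `bus` table and `bus_name`/`wkt` columns follow the CSVs produced by `preparedata.py`; the exact queries ChatGrid generates depend on the LLM:

```
# Illustrative sketch: build a spatial query for "which buses lie within a
# given radius of a point?". ST_DWithin, ST_SetSRID, and ST_MakePoint are
# standard PostGIS functions; table/column names mirror the generated bus.csv.
def radius_query(lon, lat, km):
    meters = km * 1000
    return (
        "SELECT bus_name FROM bus "
        "WHERE ST_DWithin(wkt::geography, "
        f"ST_SetSRID(ST_MakePoint({lon}, {lat}), 4326)::geography, {meters});"
    )

print(radius_query(-122.3, 47.6, 50))
```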

## Installation

Before launching the visualization, one needs to install these packages.
> [!WARNING]
> Per https://github.com/pnnl/ExaGO/issues/129, `--legacy-peer-deps` is required as an argument to `npm install`. This will ideally be removed once the visualization is no longer experimental.
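Concretely, the install step with the workaround from the warning above is:

```
npm install --legacy-peer-deps
```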
#### Launch visualization
To launch the visualization, run
```
npm start
```
This will open a webpage with the visualization of the given network.


The figures show the visualization of the synthetic electric grid. The data for this visualization was created by merging the synthetic datasets for the [Eastern](https://electricgrids.engr.tamu.edu/electric-grid-test-cases/activsg70k/), [Western](https://electricgrids.engr.tamu.edu/electric-grid-test-cases/activsg10k/), and [Texas](https://electricgrids.engr.tamu.edu/electric-grid-test-cases/activsg2000/) interconnects from the [Electric Grid Test Case Repository](https://electricgrids.engr.tamu.edu/).
#### 2D synthetic US western grid network display
![](images/network_viz.PNG)

#### 2D synthetic US western grid transmission line flow display
![](images/flow_viz.PNG)

#### 2.5D synthetic US western grid network display with generation overlaid and a doughnut chart for generation mix
![](images/generation_viz.PNG)

#### 2.5D synthetic US western grid displaying load profile by counties
![](images/load_viz.PNG)

#### 2.5D synthetic US western grid displaying network, flow, generation, and load
![](images/all_viz.PNG)

#### Demo
See the [demo notebook](../tutorials/demo1.ipynb).

#### ChatGrid
ChatGrid is a natural language query tool for ExaGO visualizations, powered by OpenAI GPT-3.5-Turbo and LangChain. ChatGrid lets users query ExaGO visualizations in natural language and returns text summaries and visual outputs as answers. The following flow chart shows the architecture of ChatGrid.
![](images/chatgrid_arch.png)

ChatGrid is built upon the following services and tools.
- [PostgreSQL](https://www.postgresql.org/download/) database
- [Flask](https://flask.palletsprojects.com/en/2.3.x/) framework

#### Installing the database

1. Download PostgreSQL database from this [link](https://www.postgresql.org/download/) and install it.

Open the `config.py` file in the `viz/backend` subdirectory and replace `YOUR_DATABASE_PASSWORD` and `YOUR_DATABASE_NAME` with your own database password and database name.
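A hypothetical sketch of what the filled-in `config.py` might look like; the actual variable names in the repository's `config.py` are not shown in this README, so treat this only as a shape to match:

```
# Hypothetical config.py sketch -- variable names are illustrative, not
# necessarily those used by the ChatGrid backend.
DB_NAME = "YOUR_DATABASE_NAME"          # replace with your database name
DB_PASSWORD = "YOUR_DATABASE_PASSWORD"  # replace with your database password
DB_HOST = "localhost"                   # default PostgreSQL host
DB_PORT = 5432                          # default PostgreSQL port
OPENAI_KEY = "YOUR_OPENAI_KEY"          # replace with your OpenAI API key
```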


#### Getting your OpenAI API key
ChatGrid uses GPT models from OpenAI to process natural language queries. To use LLMs from OpenAI, you first need to go to [OpenAI's Platform website](https://platform.openai.com) and sign in with an OpenAI account. Click your profile icon at the top-right corner of the page and select "View API Keys." Click "Create New Secret Key" to generate a new API key.

Open the `config.py` file in the `viz/backend` subdirectory and replace `YOUR_OPENAI_KEY` with your own OpenAI API key.


<!-- data script -->
<!-- installation: pip install -r requirements.txt in the backend directory-->
#### Launch backend
ChatGrid uses Flask to host the backend service, which receives user queries and returns data and text summaries to update the visualizations on the frontend. Please follow the steps below to run the backend server.

1. Go to the `viz/backend` subdirectory and use the `pip install -r requirements.txt` command to install all the Python dependencies.
Now open the chat window on the frontend, type your queries, and enjoy ChatGrid!
![](images/chatgrid_case.png)
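As a rough sketch of the request/response contract: the backend's actual route names and payload fields are not shown in this README, so everything below is a hypothetical minimal Flask endpoint, not the repository's code.

```
from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route("/query", methods=["POST"])
def query():
    # The real backend would run the question through the LLM -> SQL chain
    # and query the PostGIS database; here we just echo the question back.
    question = request.get_json().get("question", "")
    return jsonify({"summary": f"Received: {question}", "rows": []})
```

With the server running, the frontend would POST a JSON body like `{"question": "..."}` and render the returned summary.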
#### Optional: model configuration
If you would like to test different LLMs with ChatGrid, you can specify the `model_name="YOUR_LLM_MODEL"` in the `viz/backend/sqlchain.py` file.
206 changes: 206 additions & 0 deletions viz/data/preparedata.py
```
import csv
import json
import sys

from shapely.geometry import shape


def getGeneration(data):
    """Aggregate generation at each substation and write generation.csv."""
    Gens = []
    minPg = 1000.0
    maxPg = 0.0
    # Per-fuel generation and capacity totals (accumulated across the whole
    # network; not currently written to the CSV).
    Pgcoal = Pghydro = Pgnuclear = Pgng = Pgsolar = Pgwind = Pgother = 0.0
    Pgcoalcap = Pghydrocap = Pgnuclearcap = Pgngcap = 0.0
    Pgsolarcap = Pgwindcap = Pgothercap = 0.0

    for feature in data['geojsondata']['features']:
        if feature['geometry']['type'] == 'Point':
            subst = feature['properties']
            name = subst['NAME']
            nbus = subst['nbus']
            Pg = 0.0
            Pcap = 0.0
            gen_fuel = ''
            ngen = 0
            KV = []
            for bus in subst['bus']:
                KV.append(bus['BASE_KV'])

                for gen in bus['gen']:
                    Pg += gen['GEN_STATUS'] * gen['PG']
                    Pcap += gen['GEN_STATUS'] * gen['PMAX']
                    gen_fuel = gen['GEN_FUEL'].lower()

                    if gen_fuel == 'wind':
                        Pgwind += gen['GEN_STATUS'] * gen['PG']
                        Pgwindcap += gen['GEN_STATUS'] * gen['PMAX']
                    elif gen_fuel == 'solar':
                        Pgsolar += gen['GEN_STATUS'] * gen['PG']
                        Pgsolarcap += gen['GEN_STATUS'] * gen['PMAX']
                    elif gen_fuel == 'coal':
                        Pgcoal += gen['GEN_STATUS'] * gen['PG']
                        Pgcoalcap += gen['GEN_STATUS'] * gen['PMAX']
                    elif gen_fuel == 'nuclear':
                        Pgnuclear += gen['GEN_STATUS'] * gen['PG']
                        Pgnuclearcap += gen['GEN_STATUS'] * gen['PMAX']
                    elif gen_fuel == 'hydro':
                        Pghydro += gen['GEN_STATUS'] * gen['PG']
                        Pghydrocap += gen['GEN_STATUS'] * gen['PMAX']
                    elif gen_fuel == 'ng':
                        Pgng += gen['GEN_STATUS'] * gen['PG']
                        Pgngcap += gen['GEN_STATUS'] * gen['PMAX']
                    else:
                        Pgother += gen['GEN_STATUS'] * gen['PG']
                        Pgothercap += gen['GEN_STATUS'] * gen['PMAX']
                    ngen += 1

            if ngen:
                # Color the substation by the fuel type of its last-listed
                # generator.
                if gen_fuel == 'wind':
                    color = 'green'
                elif gen_fuel == 'solar':
                    color = 'yellow'
                elif gen_fuel == 'coal':
                    color = 'gray'
                elif gen_fuel == 'nuclear':
                    color = 'red'
                elif gen_fuel == 'hydro':
                    color = 'blue'
                elif gen_fuel == 'ng':
                    color = 'orange'
                else:
                    color = 'black'

                minPg = min(minPg, Pg)
                maxPg = max(maxPg, Pg)
                geo = shape(feature["geometry"])
                Gens.append({
                    "coordinates": geo.wkt,
                    "Power generated": Pg,
                    "Power capacity": Pcap,
                    "KVlevels": set(KV),
                    "color": color,
                    "generation name": name,
                    "number of buses": nbus,
                    "generation type": gen_fuel,
                })

    keys = Gens[0].keys()
    with open('generation.csv', 'w', newline='') as output_file:
        dict_writer = csv.DictWriter(output_file, keys)
        dict_writer.writeheader()
        dict_writer.writerows(Gens)


def getBus(data):
    """Write one row per substation ("Point" feature) to bus.csv."""
    points = []
    for feature in data['geojsondata']['features']:
        if feature['geometry']['type'] == 'Point':
            geo = shape(feature["geometry"])
            p = feature["properties"]
            points.append({
                "wkt": geo.wkt,
                "bus_name": p["NAME"],
                "kilovolt levels": set(p["KVlevels"]),
                "number of buses": p["nbus"],
                "vm": p["Vm"],
            })

    keys = points[0].keys()
    with open('bus.csv', 'w', newline='') as output_file:
        dict_writer = csv.DictWriter(output_file, keys)
        dict_writer.writeheader()
        dict_writer.writerows(points)


def getLine(data):
    """Write one row per line ("LineString" feature) to transmission_line.csv.

    Line names have the form "<from> -- <to>". When the real power flow PF is
    negative, source and target are swapped so flow always runs from source to
    target, and "actual flow" stores its magnitude.
    """
    lines = []
    for feature in data['geojsondata']['features']:
        if feature['geometry']['type'] == "LineString":
            geo = shape(feature["geometry"])
            p = feature["properties"]
            x = p["NAME"].split(' -- ')
            source, target = (x[0], x[1]) if p['PF'] > 0 else (x[1], x[0])
            lines.append({
                "wkt": geo.wkt,
                "flow capacity": p["RATE_A"],
                "pf": p["PF"],
                "qf": p["QF"],
                "pt": p["PT"],
                "qt": p["QT"],
                "kilovolt": p["KV"],
                "line_name": p["NAME"],
                "source": source,
                "target": target,
                "actual flow": abs(p["PF"]),
            })

    keys = lines[0].keys()
    with open('transmission_line.csv', 'w', newline='') as output_file:
        dict_writer = csv.DictWriter(output_file, keys)
        dict_writer.writeheader()
        dict_writer.writerows(lines)


def createModuleFile(filename):
    """Write ../module_casedata.js, which loads the JSON case data for the
    visualization frontend."""
    with open('../module_casedata.js', 'w') as f:
        f.write('// ExaGo Viz Input File\n')
        f.write('\n')
        f.write('module.exports = {\n')
        f.write('\n')
        f.write('\tget_casedata: function () {\n')
        f.write('\t\t\t\tvar inputcasedata = require("./data/' + filename + '");\n')
        f.write('\n')
        f.write('\t\t\t\tvar casedata0 = {};\n')
        f.write('\t\t\t\tcasedata0.geojsondata = {};\n')
        f.write('\t\t\t\tcasedata0.geojsondata.type = "FeatureCollection";\n')
        f.write(
            '\t\t\t\tcasedata0.geojsondata.features = [...inputcasedata.geojsondata.features];\n')
        f.write('\t\t\t\treturn casedata0;\n')
        f.write('\t\t\t}\n')
        f.write('\n')
        f.write('};\n')


def main():
    filename = sys.argv[1]
    createModuleFile(filename)
    with open(filename) as f:
        data = json.load(f)
    getGeneration(data)
    getBus(data)
    getLine(data)


if __name__ == "__main__":
    main()
```
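To sanity-check the JavaScript wrapper that `createModuleFile` emits without touching `../module_casedata.js`, the same writing logic can be pointed at an in-memory buffer. This is a re-sketch for testing purposes, not part of the script itself:

```
import io


def write_module(out, filename):
    # The same text createModuleFile writes, but to any file-like object.
    out.write('// ExaGo Viz Input File\n\n')
    out.write('module.exports = {\n\n')
    out.write('\tget_casedata: function () {\n')
    out.write('\t\t\t\tvar inputcasedata = require("./data/' + filename + '");\n\n')
    out.write('\t\t\t\tvar casedata0 = {};\n')
    out.write('\t\t\t\tcasedata0.geojsondata = {};\n')
    out.write('\t\t\t\tcasedata0.geojsondata.type = "FeatureCollection";\n')
    out.write('\t\t\t\tcasedata0.geojsondata.features = [...inputcasedata.geojsondata.features];\n')
    out.write('\t\t\t\treturn casedata0;\n')
    out.write('\t\t\t}\n\n')
    out.write('};\n')


buf = io.StringIO()
write_module(buf, "opflowout.json")
print(buf.getvalue())
```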
