Developer Guide

The website is deployed in a Kubernetes cluster. Each deployment contains the following containers:

website: A Flask app with static files compiled by Webpack.
mixer: A Data Commons API server.
esp: Google Extensible Service Proxy, used for endpoints management.

mixer is a submodule of this Git repo. The exact commit of the submodule is deployed together with the website, so it may not be the same version as the one reported at https://api.datacommons.org/version. Make sure to update and track the mixer changes for each new deployment:

git submodule foreach git pull origin master
git submodule update --init --recursive
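
To confirm that the checked-out submodule commits match what the repo records, the output of git submodule status can be filtered. The sketch below is illustrative only: out_of_sync_submodules is a hypothetical helper, not a script in this repo. It relies on the standard status prefixes, where "-" marks an uninitialized submodule, "+" marks a checked-out commit that differs from the recorded one, and "U" marks merge conflicts.

```shell
# Hypothetical helper: reads `git submodule status` output on stdin and
# prints the path of every submodule whose line starts with '-', '+' or 'U'.
out_of_sync_submodules() {
  while IFS= read -r line; do
    case "$line" in
      [-+U]*)
        rest=${line#?}       # drop the one-character status prefix
        rest=${rest#* }      # drop the commit hash
        echo "${rest%% *}"   # keep the path (first remaining field)
        ;;
    esac
  done
}

# Only query git when run inside a work tree.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
  git submodule status --recursive | out_of_sync_submodules
fi
```

No output means every submodule is at its recorded commit. Paths containing spaces are not handled by this sketch.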

Local Development with Flask

For changes that do not test GCP deployment or involve mixer changes, you can simply run the website in a local environment (Mac or Linux machine). In this setup, the local Flask app talks to the autopush mixer.

Note: the autopush mixer contains the latest data and mixer code changes. It is necessary to update the mixer submodule if compatibility is required between website and mixer changes.

Prerequisites

WARNING: Make sure to go through each of the following steps.

Python 3.11

Confirm that the python3 version is 3.11.x. Otherwise, install or upgrade Python, then confirm the version again:
```
python3 --version
```
Node.js 18.4.0

Install nodejs and nvm. Run the following commands to use Node.js 18.4.0:
```
nvm install 18.4.0
nvm use 18.4.0
```
Protoc 3.21.9

Install protoc at version 3.21.9.
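
The three required versions above can be checked in one pass. The following is a minimal sketch, not part of the repo: check_version is a hypothetical helper, and the expected strings assume the tools print their versions in the usual forms ("Python 3.11.x", "v18.4.0", "libprotoc 3.21.9").

```shell
# Hypothetical helper: check_version NAME ACTUAL PATTERN
# PATTERN is a shell glob matched against the tool's version output.
check_version() {
  case "$2" in
    $3) echo "OK: $1 ($2)" ;;
    *)  echo "MISMATCH: $1 reported '$2'"; return 1 ;;
  esac
}

# Each tool is checked only if it is installed; a mismatch is reported
# but does not abort the remaining checks.
if command -v python3 >/dev/null 2>&1; then
  check_version python3 "$(python3 --version 2>&1)" 'Python 3.11.*' || true
fi
if command -v node >/dev/null 2>&1; then
  check_version node "$(node --version)" 'v18.4.0' || true
fi
if command -v protoc >/dev/null 2>&1; then
  check_version protoc "$(protoc --version)" 'libprotoc 3.21.9' || true
fi
```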
[Optional] gcloud

gcloud is required for place search to work locally, so install gcloud first.

Then ask the Data Commons team to grant you access to the Google Maps API key.

Finally, authenticate locally with:
```
gcloud auth application-default login
```

Package javascript and static assets

./run_npm.sh

This watches the static files for changes and rebuilds them on each code edit.

NOTE: On macOS machines with an M1 chip, run the following command before running the above command. See this for more details.

brew install pkg-config cairo pango libpng jpeg giflib librsvg

Start the Flask Server

Start the Flask web server locally at localhost:8080:

./run_server.sh

If you don't have access to the Data Commons Maps API, you can bring up the website without place search functionality:

./run_server.sh -e lite

There are multiple environments for the server, selected with the -e option. For example, custom is for custom Data Commons and iitm is for iitm Data Commons.

To start multiple instances, bind each server instance to a different port with the -p option. The default port is 8080.

Note that the script's syntax is strict: leave a space after the flag. For example, ./run_server.sh -p 8081 starts the server on port 8081, but ./run_server.sh -p=8081 is not accepted.
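
One common reason for this kind of strictness is getopts-style parsing, where everything after the flag letter in the same word becomes the argument. The sketch below assumes that style purely for illustration; parse_port is a hypothetical stand-in and not the actual contents of run_server.sh.

```shell
# Hypothetical stand-in for a script's flag parsing (NOT the real run_server.sh).
parse_port() {
  port=8080          # default port
  OPTIND=1           # reset so the function can be called repeatedly
  while getopts "p:" opt "$@"; do
    case "$opt" in
      p) port=$OPTARG ;;
      *) return 1 ;;
    esac
  done
  echo "$port"
}

parse_port -p 8081   # prints 8081
parse_port -p=8081   # prints =8081, which is not a usable port
parse_port           # prints 8080
```

With this style of parsing, the "=" is not treated as a separator, so it ends up inside the value.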

To enable language models

./run_server.sh -m

Start NL Server

Natural language models are hosted on a separate server. For features that depend on it (all NL-based interfaces and endpoints), the NL server needs to be brought up locally (in a separate process):

./run_nl_server.sh -p 6060

By default the NL server runs on port 6060.

Use Local Mixer

If a local mixer is needed, start it by following this instruction. This allows development with a custom BigTable or with mixer code changes. Make sure to also run ESP locally.

Then start the Flask server with the -l option so that it uses the local mixer:

./run_server.sh -l

Deploy local changes to dev instance in GCP

Commit all changes locally, so that the local change is identified by a git hash. Then run:

gcloud auth login
gcloud auth configure-docker
./scripts/push_image.sh
./scripts/deploy_gke_helm.sh -e dev

These scripts build a Docker image locally, tag it with the local git commit hash at HEAD, and deploy it to the dev instance in GKE.

View the deployment at link.
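
Because the image is identified by the commit hash at HEAD, uncommitted edits would not be reflected in that hash. A minimal pre-deploy check could look like the sketch below; tree_is_clean is a hypothetical helper and is not part of the deploy scripts.

```shell
# Hypothetical helper: succeeds only when the given `git status --porcelain`
# output is empty, i.e. there are no uncommitted or untracked changes.
tree_is_clean() {
  [ -z "$1" ]
}

# Only query git when run inside a work tree.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
  if tree_is_clean "$(git status --porcelain)"; then
    echo "Working tree is clean; HEAD is $(git rev-parse HEAD 2>/dev/null)"
  else
    echo "Uncommitted changes present; commit before deploying." >&2
  fi
fi
```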

Run Tests

Install web browser and webdriver

❗IMPORTANT: Make sure that your ChromeDriver version is compatible with your local Google Chrome version.

Before running the tests, install the browser and webdriver. We recommend Google Chrome and ChromeDriver.

Chrome browser can be downloaded here.
ChromeDriver can be downloaded here, or you can install it directly with a package manager:
```
npm install chromedriver
```

You can view the latest ChromeDriver version here. Also make sure that PATH includes the ChromeDriver location.
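
A quick way to check compatibility is to compare the major versions that the two binaries report. The sketch below assumes that matching major versions is the relevant criterion and that the browser binary is named google-chrome (typical on Linux; it differs on macOS). major_version is a hypothetical helper.

```shell
# Hypothetical helper: extracts the major version (the digits before the first
# dot) from strings such as "Google Chrome 114.0.5735.90" or
# "ChromeDriver 114.0.5735.90 (...)".
major_version() {
  echo "$1" | sed -n 's/^[^0-9]*\([0-9][0-9]*\)\..*$/\1/p'
}

# Compare the two only when both binaries are on PATH.
if command -v google-chrome >/dev/null 2>&1 && command -v chromedriver >/dev/null 2>&1; then
  chrome_major=$(major_version "$(google-chrome --version)")
  driver_major=$(major_version "$(chromedriver --version)")
  if [ "$chrome_major" = "$driver_major" ]; then
    echo "Chrome and ChromeDriver both report major version $chrome_major"
  else
    echo "Mismatch: Chrome $chrome_major vs ChromeDriver $driver_major" >&2
  fi
fi
```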

If you are using a Linux system, you can run the following commands to download the Chrome browser and ChromeDriver. They also take care of the path setup:

wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
sudo dpkg -i google-chrome-stable_current_amd64.deb; sudo apt-get -fy install
CHROMEDRIVERV=$(curl https://chromedriver.storage.googleapis.com/LATEST_RELEASE)
wget https://chromedriver.storage.googleapis.com/${CHROMEDRIVERV}/chromedriver_linux64.zip
unset CHROMEDRIVERV
unzip chromedriver_linux64.zip
sudo mv chromedriver /usr/bin/chromedriver
sudo chown root:root /usr/bin/chromedriver
sudo chmod +x /usr/bin/chromedriver

❗ NOTE: If using macOS with an ARM processor (M1 chip), run the local NL server before running the tests:

./run_nl_server.sh -p 6060

Run all tests

./run_test.sh -a

Update React test snapshots

cd static
npm test . -- -u

Other Developing Tips

Deploy latest code/data

The autopush instance (autopush.datacommons.org) always has the latest code and data. To bring another dev/demo instance up to date, run the following in a clean git checkout:

./scripts/deploy_latest.sh <ENV_NAME>

Debug Flask in Visual Studio Code

[Optional] Update variables in 'env' of 'Flask' configurations in .vscode/launch.json as needed.
In the left-hand side menu of VS Code, click on "Run and Debug".
At the top of the "Run and Debug" pane, select "DC Website Flask" and click on the green "Play" button.
In "DEBUG CONSOLE" (not "TERMINAL"), check that the server logs show up.

This brings up the Flask server from the debugger. Now you can set breakpoints and inspect variables from the debugger pane.

TIPS: you can inspect variables at the bottom of the "DEBUG CONSOLE" window.

A full tutorial on debugging a Flask app in Visual Studio Code is available here.

Add new charts in Place Page

Update server/config/chart_config/<category>.json with the new chart.

   {
     "category": "", // The top level category this chart belongs to. Order of charts in the spec matters.
     "topic": "",  // Strongly encouraged - A page-level grouping for this chart.
     "titleId": "", // Strictly for translation purposes.
     "title": "", // Default (EN) display string
     "description": "", // Strictly for translation purposes.
     "statsVars": [""], // List of stat vars to include in the chart
     "isOverview": true, // Optional - default false. If the chart should be added to the overview page.
     "isChoropleth": true, // Optional - default false. If a map should be used to display the data
     "unit": "",
     "scaling": 100,
     "relatedChart": {  // Defined if there should be comparison charts added
       // All chart fields from above can be specified. If unspecified, it will be inherited.
     }
   }

Update related files.
- If adding a new category, create a new config file in server/chart_config and add the new category to:
  - static/js/shared/util.ts
  - server/__init__.py
- If a new stat var is introduced, also update:
  - Labels that appear as chips under comparison charts: static/js/i18n/strings/en/stats_var_labels.json
  - Titles on ranking pages: static/js/i18n/strings/en/stats_var_titles.json
  - New stat vars which have not been cached: NEW_STAT_VARS
- If a new unit is required, update:
  - static/js/i18n/i18n.tsx
  - static/js/i18n/strings/*/units.json (with display names and labels for the unit in ALL languages)
Note: Please add very detailed descriptions to guide our translators. See localization.md for more details.

Run these commands:

./scripts/extract_messages.sh
./scripts/compile_messages.sh

IMPORTANT: Manually restart Flask to reload the config and translations. Most likely, this means re-running run_server.sh.
Test the data on a place page!

Debugging Webdriver tests

Disable headless mode in webdriver to follow the test in Chrome. Chrome features like the dev inspector are available in this mode, which is useful when combined with sleep() to give you time to inspect the page. To enter this mode, comment out this line in base.py:
```
chrome_options.add_argument('--headless')
```
Another option is to save a screenshot at various points of the test:
```
self.driver.save_screenshot(filename)
```

GKE config

The GKE configuration is stored here.

Redis memcache

Redis memcache is used for production deployment. Each cluster has a Redis instance located in the same region.

Testing cloudbuild changes

To test .yaml cloudbuild files, you can use cloud-build-local to dry run the file before actually pushing. Find documentation for how to install and use cloud-build-local here.
