Commit
Update PDFM tutorial (#3)
* Update PDFM tutorial

* Update README
giswqs authored Nov 27, 2024
1 parent f9ebdff commit d7e60b9
Showing 3 changed files with 184 additions and 13 deletions.
6 changes: 6 additions & 0 deletions README.md
@@ -1,3 +1,9 @@
# GeoAI-Tutorials

A collection of Jupyter notebook examples for using GeoAI

## Population Dynamics Foundation Model

Tutorials for Using Google's [Population Dynamics Foundation Model](https://github.com/google-research/population-dynamics) (PDFM)

- [Predicting US Housing Prices with PDFM and Zillow Data](https://geoai-tutorials.gishub.org/PDFM/zillow_home_value/)
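
The tutorial linked above joins Zillow home values to PDFM embeddings by ZIP code before any model is trained. The following is a minimal sketch of that join on tiny made-up tables, not the notebook's actual cell. The `set_index("place")` and `dropna(subset=[label])` calls appear in the notebook; the `RegionName` column, the `zip/<code>` key format, the date column, and all values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical Zillow-style table: one row per ZIP code (values are made up).
zhvi_df = pd.DataFrame(
    {
        "RegionName": [10001, 94043, 60601],
        "2024-10-31": [750_000.0, None, 420_000.0],
    }
)

# Hypothetical embeddings table indexed by an assumed "zip/<code>" place ID.
zipcode_embeddings = pd.DataFrame(
    {"feature0": [0.1, 0.2], "feature1": [0.3, 0.4]},
    index=pd.Index(["zip/10001", "zip/94043"], name="place"),
)

label = "2024-10-31"  # assumed label column: home value for one month

# Build a key that matches the embeddings index, zero-padded to five digits.
zhvi_df["place"] = "zip/" + zhvi_df["RegionName"].astype(str).str.zfill(5)

# Keep only ZIP codes present in both tables, then drop rows with no label.
data = zhvi_df.set_index("place").join(zipcode_embeddings, how="inner")
data = data.dropna(subset=[label])
print(data)
```

An inner join is used here so that every remaining row has both a label and a full embedding; a left join would instead keep unmatched ZIP codes with missing features.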
185 changes: 172 additions & 13 deletions docs/PDFM/zillow_home_value.ipynb
@@ -4,15 +4,23 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"**Predicting US Housing Prices at the Zip Code Level Using Google's Population Dynamics Foundation Model and Zillow Data**\n",
"\n",
"[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/opengeos/GeoAI-Tutorials/blob/main/docs/PDFM/zillow_home_value.ipynb)\n",
"\n",
"**Predicting US Housing Prices at the Zip Code Level Using Google's Population Dynamics Foundation Model and Zillow Data**\n",
"\n",
"## Useful Resources\n",
"\n",
"- [Google's Population Dynamics Foundation Model (PDFM)](https://github.com/google-research/population-dynamics)\n",
"- Request access to PDFM embeddings [here](https://github.com/google-research/population-dynamics?tab=readme-ov-file#getting-access-to-the-embeddings)\n",
"- Zillow data can be accessed [here](https://www.zillow.com/research/data/)"
"- Zillow data can be accessed [here](https://www.zillow.com/research/data/)\n",
"\n",
"## Acknowledgements\n",
"\n",
"This notebook is adapted from the [PDFM tutorial](https://github.com/google-research/population-dynamics/blob/master/notebooks/pdfm_earth_engine.ipynb). Credit goes to the authors of the PDFM tutorial.\n",
"\n",
"## Installation\n",
"\n",
"Uncomment and run the following cell to install the required libraries."
]
},
{
@@ -24,6 +32,13 @@
"# %pip install leafmap scikit-learn"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Import Libraries"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -38,6 +53,13 @@
"from leafmap.common import evaluate_model, plot_actual_vs_predicted, download_file"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Download Zillow Data"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -58,6 +80,13 @@
" download_file(zhvi_url, zhvi_file)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Process Zillow Data"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -69,17 +98,63 @@
"zhvi_df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Request access to PDFM Embeddings"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"embeddings_file_path = \"data/zcta_embeddings.csv\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To request access to PDFM embeddings, please follow the instructions [here](https://github.com/google-research/population-dynamics?tab=readme-ov-file#getting-access-to-the-embeddings)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"if not os.path.exists(embeddings_file_path):\n",
" raise FileNotFoundError(\"Please request the embeddings from Google\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load PDFM Embeddings"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"embeddings_file_path = \"data/zcta_embeddings.csv\"\n",
"zipcode_embeddings = pd.read_csv(embeddings_file_path).set_index(\"place\")\n",
"zipcode_embeddings.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Join Zillow and PDFM Data"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -109,6 +184,13 @@
"data = data.dropna(subset=[label])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Split Train and Test Data"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -121,17 +203,44 @@
"\n",
"X_train, X_test, y_train, y_test = train_test_split(\n",
" X, y, test_size=0.2, random_state=42\n",
")\n",
"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Fit Linear Regression Model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Initialize and train a simple linear regression model\n",
"model = LinearRegression()\n",
"model.fit(X_train, y_train)\n",
"\n",
"# Make predictions\n",
"y_pred = model.predict(X_test)\n",
"\n",
"y_pred = model.predict(X_test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Evaluate Linear Regression Model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"evaluation_df = pd.DataFrame({\"y\": y_test, \"y_pred\": y_pred})\n",
"# Evaluate the model\n",
"metrics = evaluate_model(evaluation_df)\n",
"print(metrics)"
]
@@ -142,7 +251,29 @@
"metadata": {},
"outputs": [],
"source": [
"plot_actual_vs_predicted(evaluation_df, xlim=(0, 3_000_000), ylim=(0, 3_000_000))"
"xy_lim = (0, 3_000_000)\n",
"plot_actual_vs_predicted(\n",
" evaluation_df,\n",
" xlim=xy_lim,\n",
" ylim=xy_lim,\n",
" title=\"Actual vs Predicted Home Values\",\n",
" x_label=\"Actual Home Value\",\n",
" y_label=\"Predicted Home Value\",\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![image](https://github.com/user-attachments/assets/286638f1-88b6-4327-a883-c4e4512fbcdb)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Fit K-Nearest Neighbors Model"
]
},
{
@@ -155,8 +286,22 @@
"model = KNeighborsRegressor(n_neighbors=k)\n",
"model.fit(X_train, y_train)\n",
"\n",
"y_pred = model.predict(X_test)\n",
"\n",
"y_pred = model.predict(X_test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Evaluate K-Nearest Neighbors Model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"evaluation_df = pd.DataFrame({\"y\": y_test, \"y_pred\": y_pred})\n",
"# Evaluate the model\n",
"metrics = evaluate_model(evaluation_df)\n",
@@ -169,7 +314,21 @@
"metadata": {},
"outputs": [],
"source": [
"plot_actual_vs_predicted(evaluation_df, xlim=(0, 3_000_000), ylim=(0, 3_000_000))"
"plot_actual_vs_predicted(\n",
" evaluation_df,\n",
" xlim=xy_lim,\n",
" ylim=xy_lim,\n",
" title=\"Actual vs Predicted Home Values\",\n",
" x_label=\"Actual Home Value\",\n",
" y_label=\"Predicted Home Value\",\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![image](https://github.com/user-attachments/assets/e8cdcd23-1945-480b-896a-49e273716bf1)"
]
}
],
6 changes: 6 additions & 0 deletions docs/index.md
@@ -1,3 +1,9 @@
# GeoAI-Tutorials

A collection of Jupyter notebook examples for using GeoAI

## Population Dynamics Foundation Model

Tutorials for Using Google's [Population Dynamics Foundation Model](https://github.com/google-research/population-dynamics) (PDFM)

- [Predicting US Housing Prices with PDFM and Zillow Data](https://geoai-tutorials.gishub.org/PDFM/zillow_home_value/)
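
The notebook changed in this commit splits the joined data 80/20, fits a linear regression and a k-nearest neighbors regressor, and evaluates each on the held-out set. The sketch below reproduces that flow on synthetic data so it runs without the PDFM embeddings, which require an access request. The feature names, sample size, `n_neighbors=5`, and the generated home values are all assumptions, and scikit-learn metrics stand in for leafmap's `evaluate_model` helper.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for the joined table: 8 hypothetical embedding columns
# and a home-value label that depends linearly on them, plus noise.
rng = np.random.default_rng(42)
X = pd.DataFrame(
    rng.normal(size=(500, 8)), columns=[f"feature{i}" for i in range(8)]
)
weights = rng.uniform(20_000, 60_000, size=8)
y = 300_000 + X.to_numpy() @ weights + rng.normal(scale=10_000, size=500)

# Same split settings as the notebook: 20% test set, fixed seed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "linear": LinearRegression(),
    "knn": KNeighborsRegressor(n_neighbors=5),  # k=5 is an assumed value
}

# Fit each model and score it on the held-out rows.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "r2": r2_score(y_test, y_pred),
        "mae": mean_absolute_error(y_test, y_pred),
    }

print(pd.DataFrame(results).T)
```

Because the synthetic label is linear in the features, the linear model scores best here by construction; this says nothing about which model performs better on the real embeddings.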
