195 update documentation and dependencies #196

Merged: 6 commits (Oct 10, 2023)

4 changes: 1 addition & 3 deletions CONTRIBUTING.md
@@ -43,9 +43,7 @@ Module names come from the names of the .py files containing function declarations

- Try to create modules so that each module contains only one functionality. Split this functionality into two function declarations: one for external use and one (the core functionality) for internal use. See e.g. the implementation of [clipping functionality](./eis_toolkit/raster_processing/clipping.py) for reference.

- For large or complex functionalities, it is okay to include multiple (helper) functions in one module/file. A moderate number of functions can live in a single file, but if several helper functions are needed (and they are not general enough to belong in the utilities module), you can create a secondary file for your functionality, for example `clipping_functions.py` or `clipping_utilities.py` for `clipping.py`.

- 3. Functions
+ 1. Functions

Name each function according to what it is supposed to do. Try to express the purpose as simply as possible. In principle, each function should be created to execute one task. We prefer a modular structure and low hierarchy, avoiding nested function declarations. It is highly recommended to call other functions for executing subtasks.
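As an illustration of the module and function conventions above, here is a minimal sketch (the names and the clipping logic are hypothetical, not the toolkit's actual API): one public function serves as the external entry point and validates input, while a private core function does the actual work.

```python
import numpy as np


def _clip_core(values: np.ndarray, minimum: float, maximum: float) -> np.ndarray:
    # Core functionality, intended for internal use only.
    return np.clip(values, minimum, maximum)


def clip_values(values: np.ndarray, minimum: float, maximum: float) -> np.ndarray:
    """Clip values to the given range (the external entry point)."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    return _clip_core(values, minimum, maximum)
```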

72 changes: 5 additions & 67 deletions README.md
@@ -29,77 +29,17 @@ If you are contributing by implementing new functionalities, read the **For developers** section.

## For developers

### Prerequisites

All contributing developers need git and a copy of the repository.

```console
git clone https://github.com/GispoCoding/eis_toolkit.git
```

After this you have three options for setting up your local development environment.
1. Docker
2. Python venv
3. Conda

Docker is recommended as it containerizes the whole development environment, making sure it stays identical across different developers and operating systems. Using a container also keeps your own computer clean of all dependencies.

### Setting up a local development environment with docker (recommended)
Build and run the eis_toolkit container. Run this and every other command in the repository root unless otherwise directed.

```console
docker compose up -d
```

If you need to rebuild an already existing container (e.g. when dependencies have been updated), run

```console
docker compose up -d --build
```

### Working with the container

Attach to the running container

```console
docker attach eis_toolkit
```

You are now in your local development container, and all your commands in the current terminal window interact with the container.

**Note** that your local repository gets automatically mounted into the container. This means that:
- The repository in your computer's filesystem and in the container are exactly the same
- Changes from either one carry over to the other instantly, without any need for restarting the container

For your workflow this means that:
- You can edit all files like you normally would (on your own computer, with your favourite text editor etc.)
- You must do all testing and running of the code inside the container
1. Docker - [instructions](./instructions/dev_setup_with_docker.md)
2. Poetry - [instructions](./instructions/dev_setup_without_docker.md)
3. Conda - [instructions](./instructions/dev_setup_without_docker_with_conda.md)

### Python inside the container

Whether or not we use Docker, we manage the Python dependencies with Poetry. This means that a Python venv is found in the container too. Inside the container, you can activate the venv like you normally would

```console
poetry shell
```

and run your code and tests from the command line. For example:

```console
python <path/to/your/file.py>
```

or

```console
pytest
```

You can also run commands from outside the venv, just prefix them with `poetry run`. For example:

```console
poetry run pytest
```

### Additional instructions

@@ -108,10 +48,8 @@ Here are some additional instructions related to the development of EIS toolkit:
- [Generating documentation](./instructions/generating_documentation.md)
- [Using jupyterlab](./instructions/using_jupyterlab.md)

If you want to set up the development environment without docker, see:
- [Setup without docker with poetry](./instructions/dev_setup_without_docker.md)
- [Setup without docker with conda](./instructions/dev_setup_without_docker_with_conda.md)

## For users
TBD when first release is out.

## License

6 changes: 3 additions & 3 deletions eis_toolkit/exploratory_analyses/k_means_cluster.py
@@ -18,17 +18,17 @@ def _k_means_clustering(
# The elbow method
k_max = 10
inertia = np.array(
- [KMeans(n_clusters=k, random_state=0).fit(coordinates).inertia_ for k in range(1, k_max + 1)]
+ [KMeans(n_clusters=k, random_state=0, n_init=10).fit(coordinates).inertia_ for k in range(1, k_max + 1)]
)

inertia = np.diff(inertia, 2)
scaled_derivatives = [i * 100 for i in inertia]
k_optimal = scaled_derivatives.index(min(scaled_derivatives))

- kmeans = KMeans(n_clusters=k_optimal, random_state=random_state)
+ kmeans = KMeans(n_clusters=k_optimal, random_state=random_state, n_init=10)

else:
- kmeans = KMeans(n_clusters=number_of_clusters, random_state=random_state)
+ kmeans = KMeans(n_clusters=number_of_clusters, random_state=random_state, n_init=10)

kmeans.fit(coordinates)
data["cluster"] = kmeans.labels_
36 changes: 0 additions & 36 deletions eis_toolkit/exploratory_analyses/plot_pca.py

This file was deleted.

57 changes: 57 additions & 0 deletions instructions/dev_setup_with_docker.md
@@ -0,0 +1,57 @@
### Development with Docker

Build and run the eis_toolkit container. Run this and every other command in the repository root unless otherwise directed.

```console
docker compose up -d
```

If you need to rebuild an already existing container (e.g. when dependencies have been updated), run

```console
docker compose up -d --build
```
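To verify that the container is up, you can list the running Compose services (the service name `eis_toolkit` is assumed here, as the attach step below suggests):

```console
docker compose ps
```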

### Working with the container

Attach to the running container

```console
docker attach eis_toolkit
```

You are now in your local development container, and all your commands in the current terminal window interact with the container.
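To detach from the container without stopping it, use Docker's default escape sequence: Ctrl-p followed by Ctrl-q (assuming the default `detachKeys` configuration).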

**Note** that your local repository gets automatically mounted into the container. This means that:
- The repository in your computer's filesystem and in the container are exactly the same
- Changes from either one carry over to the other instantly, without any need for restarting the container

For your workflow this means that:
- You can edit all files like you normally would (on your own computer, with your favourite text editor etc.)
- You must do all testing and running of the code inside the container

### Python inside the container

Whether or not we use Docker, we manage the Python dependencies with Poetry. This means that a Python venv is found in the container too. Inside the container, you can activate the venv like you normally would

```console
poetry shell
```

and run your code and tests from the command line. For example:

```console
python <path/to/your/file.py>
```

or

```console
pytest
```

You can also run commands from outside the venv, just prefix them with `poetry run`. For example:

```console
poetry run pytest
```
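For example, to run a single test module from outside the venv (the path below points to one of the test files in this repository):

```console
poetry run pytest tests/conversions/raster_to_dataframe_test.py
```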
4 changes: 2 additions & 2 deletions instructions/dev_setup_without_docker.md
@@ -1,5 +1,5 @@
- # Development without docker
- If you do not have docker, you can setup your local development environment as a python virtual environment.
+ # Development with Poetry
+ If you do not have docker, you can set up your local development environment as a Python virtual environment using Poetry.
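In outline, the flow documented in this file looks like the following (a minimal sketch, assuming Poetry itself is already installed; the rest of the file covers the details):

```console
poetry install
poetry shell
```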

## Prerequisites

66 changes: 3 additions & 63 deletions poetry.lock

Some generated files are not rendered by default.

3 changes: 1 addition & 2 deletions pyproject.toml
@@ -18,7 +18,7 @@ keywords = [
]

[tool.poetry.dependencies]
python = ">=3.8,<3.11"
python = ">=3.9,<3.11"
gdal = "3.4.3"
rasterio = "^1.3.0"
pandas = "^1.4.3"
@@ -29,7 +29,6 @@ statsmodels = "^0.13.2"
keras = "^2.9.0"
tensorflow = "^2.9.1"
mkdocs-material = "^8.4.0"
- plotly = "^5.14.0"
beartype = "^0.13.1"
seaborn = "^0.12.2"
pykrige = "^1.7.0"
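One practical consequence of the constraint bump: a local Poetry environment created on Python 3.8 no longer satisfies `>=3.9,<3.11` and has to be recreated. With Poetry this is typically:

```console
poetry env use 3.10
poetry install
```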
4 changes: 1 addition & 3 deletions tests/conversions/raster_to_dataframe_test.py
@@ -2,7 +2,6 @@

import numpy as np
import pandas as pd
- import pytest
import rasterio

from eis_toolkit.conversions.raster_to_dataframe import raster_to_dataframe
@@ -11,7 +10,6 @@
test_dir = Path(__file__).parent.parent


- @pytest.mark.skip
def test_raster_to_dataframe():
"""Test raster to pandas conversion by converting pandas dataframe and then back to raster data."""
raster = rasterio.open(SMALL_RASTER_PATH)
@@ -32,7 +30,7 @@ def test_raster_to_dataframe():
"""Convert back to raster image."""
df["id"] = df.index
long_df = pd.wide_to_long(df, ["band_"], i="id", j="band").reset_index()
- long_df.loc[:, ["col", "row"]] = long_df.loc[:, ["col", "row"]].astype(int)
+ long_df = long_df.astype({"col": int, "row": int})
raster_img = np.empty((multiband_raster.count, multiband_raster.height, multiband_raster.width))
raster_img[(long_df.band - 1).to_list(), long_df.row.to_list(), long_df.col.to_list()] = long_df.band_

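Background on the `astype` change above: recent pandas versions discourage writing back into a `.loc[...]` slice to change column dtypes (in-place dtype changes through indexers are being deprecated), whereas `astype` with a dtype mapping simply returns a new frame. A minimal sketch with illustrative data:

```python
import pandas as pd

long_df = pd.DataFrame({"col": ["0", "1"], "row": ["2", "3"], "band_": [0.5, 0.7]})
# Casting via a dtype mapping returns a new DataFrame instead of mutating
# a slice in place, keeping the dtype conversion explicit and warning-free.
long_df = long_df.astype({"col": int, "row": int})
print(long_df.dtypes)
```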