Episcanner Downloader is a data downloader for the Episcanner application. It retrieves data on diseases such as dengue, zika and chikungunya and saves it to a specified directory in CSV, Parquet or DuckDB format.
- Fetches data related to diseases from the Episcanner application
- Supports downloading data for specific diseases
- Saves downloaded data to a designated directory
To install Episcanner Downloader, follow these steps:
- Clone the repository:
git clone https://github.com/AlertaDengue/episcanner-downloader.git
- Navigate to the cloned directory:
cd episcanner-downloader
- Create a Conda environment using the provided YAML file:
conda env create -f conda/env-base.yaml
- Activate the Conda environment:
conda activate episcanner-downloader
- Install the dependencies using Poetry:
poetry install
- Alternatively, to install without Conda, create a virtual environment:
python -m venv env
- Activate the virtual environment:
source env/bin/activate
- Install the dependencies using Poetry:
poetry install
Before running Episcanner Downloader, make sure to set the required environment variables for connecting to the PSQL database. You can use the provided Makefile to create a .env file with the exported variables:
- Set the required environment variables for connecting to the PSQL database:
export EPISCANNER_PSQL_URI="postgresql://user:password@host:port/database"
- Create a .env file in the project root directory with the exported variables:
make dotenv
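Before exporting the variable, it can help to check that the URI is well formed. The following is a minimal sketch using only the Python standard library; `check_psql_uri` is a hypothetical helper, not part of Episcanner Downloader, and it only inspects the string without connecting to the database.

```python
from urllib.parse import urlsplit


def check_psql_uri(uri: str) -> dict:
    """Split a PostgreSQL URI into its parts and fail early if it looks wrong."""
    parts = urlsplit(uri)
    if parts.scheme not in ("postgresql", "postgres"):
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("URI has no host")
    database = parts.path.lstrip("/")
    if not database:
        raise ValueError("URI has no database name")
    return {
        "user": parts.username,
        "host": parts.hostname,
        # None when the URI omits the port; reading .port raises
        # ValueError if the port is not numeric.
        "port": parts.port,
        "database": database,
    }
```

For example, `check_psql_uri(os.environ["EPISCANNER_PSQL_URI"])` returns the user, host, port and database name while leaving the password out of the result, so the output is safe to print.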
To use Episcanner Downloader, follow these steps:
- Open a Python console or another Python interpreter and run:
from scanner import EpiScanner
scanner = EpiScanner(disease="dengue", uf="RJ", year=2024)
scanner.export("duckdb")
Replace uf with the desired state (e.g., 'MG') and disease with the specific disease you want to download ('dengue', 'chikungunya' or 'zika'). Pass output_dir to the export() method to change where the data is saved.
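To export the same selection in more than one format, the call can be wrapped in a small loop. This is a sketch under assumptions: `export_all` is a hypothetical helper, the format names "csv" and "parquet" are inferred from the description of supported formats, and `output_dir` is assumed to be accepted by `export()` as a keyword argument.

```python
from pathlib import Path

# Format names inferred from the project description; only "duckdb" appears
# in the usage example, so treat the others as assumptions.
FORMATS = ("csv", "parquet", "duckdb")


def export_all(scanner, output_dir, formats=FORMATS):
    """Call scanner.export() once per format, writing into output_dir."""
    unknown = [fmt for fmt in formats if fmt not in FORMATS]
    if unknown:
        raise ValueError(f"unsupported format(s): {unknown!r}")
    target = Path(output_dir).expanduser()
    # Create the directory up front in case export() does not do so itself.
    target.mkdir(parents=True, exist_ok=True)
    for fmt in formats:
        # Assumption: export() takes output_dir as a keyword argument.
        scanner.export(fmt, output_dir=str(target))
    return target
```

With a scanner created as shown above, `export_all(scanner, "~/episcanner-data")` would write all three formats to the same directory.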
- To read the data, open the file using duckdb:
import duckdb
db = duckdb.connect("<$HOME>/episcanner/episcanner.duckdb")
db.execute("SELECT * FROM 'RJ' WHERE disease = 'dengue' AND year = 2024").fetchdf()
Replace <$HOME> with your actual home directory, or use the output_dir specified in the export() method.
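The read step can also be written without hard-coding the home directory or the filter values. The sketch below is illustrative only: the helper names are hypothetical, and it assumes the database file is called episcanner.duckdb and sits directly inside output_dir when one is given. The state code is validated because table names cannot be bound as query parameters, while disease and year are bound with `?` placeholders.

```python
import re
from pathlib import Path


def default_db_path(output_dir=None) -> Path:
    """Return the DuckDB file location, defaulting to ~/episcanner."""
    base = Path(output_dir).expanduser() if output_dir else Path.home() / "episcanner"
    return base / "episcanner.duckdb"


def build_query(uf: str) -> str:
    """Build the SELECT for one state table, accepting only two-letter codes."""
    if not re.fullmatch(r"[A-Z]{2}", uf):
        raise ValueError(f"invalid state code: {uf!r}")
    # uf is validated above and quoted as an identifier;
    # disease and year are supplied later as bound parameters.
    return f'SELECT * FROM "{uf}" WHERE disease = ? AND year = ?'


def read_state(uf, disease, year, output_dir=None):
    """Read one state's rows for a disease and year into a pandas DataFrame."""
    import duckdb  # imported here so the helpers above work without duckdb

    con = duckdb.connect(str(default_db_path(output_dir)), read_only=True)
    try:
        return con.execute(build_query(uf), [disease, year]).fetchdf()
    finally:
        con.close()
```

Calling `read_state("RJ", "dengue", 2024)` would then run the same query as the example above against the default location.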
Episcanner Downloader is licensed under the MIT License. See the LICENSE file for more details.