A pip-installable package to support Serenata de Amor and Rosie development.
Serenata_toolbox is compatible with Python 3.6+
$ pip install -U serenata-toolbox
If you are a regular user you are ready to get started after pip install.
If you are a core developer willing to upload datasets to the cloud you need to configure AMAZON_ACCESS_KEY and AMAZON_SECRET_KEY environment variables before running the toolbox.
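Before attempting an upload, it can help to confirm that both variables are actually set. The snippet below is a minimal sketch of such a check; `missing_credentials` is a hypothetical helper written for illustration, not part of the toolbox, and it only inspects the environment without contacting any server.

```python
import os

# Variable names taken from the paragraph above.
REQUIRED_VARS = ("AMAZON_ACCESS_KEY", "AMAZON_SECRET_KEY")


def missing_credentials(environ=None):
    """Return the required variable names that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Upload credentials are configured.")
```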
We have plenty of datasets ready for you to download from our servers, and this toolbox helps you get them. Here are some examples:
# without any arguments, downloads our pre-processed datasets and stores them in the data/ folder
$ serenata-toolbox
# downloads these specific datasets and stores them in the /tmp/serenata-data folder
$ serenata-toolbox /tmp/serenata-data --module federal_senate chamber_of_deputies
# you can specify a dataset and a year
$ serenata-toolbox --module chamber_of_deputies --year 2009
# or specify all options simultaneously
$ serenata-toolbox /tmp/serenata-data --module federal_senate --year 2017
# getting help
$ serenata-toolbox --help
Another option is creating your own Python script:
from serenata_toolbox.datasets import Datasets
datasets = Datasets('data/')
# now let's see which are the latest datasets available
for dataset in datasets.downloader.LATEST:
    print(dataset)  # and you'll see a long list of datasets!
# and let's download one of them
datasets.downloader.download('2018-01-05-reimbursements.xz') # yay, you've just downloaded this dataset to data/
# you can also get the most recent version of all datasets:
latest = list(datasets.downloader.LATEST)
datasets.downloader.download(latest)
If the last example doesn't look that simple, there are some fancy shortcuts available:
from serenata_toolbox.datasets import fetch, fetch_latest_backup
fetch('2018-01-05-reimbursements.xz', 'data/')
fetch_latest_backup('data/')  # yep, we just did exactly the same thing
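When a script runs repeatedly, downloading a dataset that is already on disk wastes time. The sketch below wraps any fetch-like callable so that the download happens only when the file is missing. `fetch_if_missing` is a hypothetical helper, not part of the toolbox, and it assumes the file is saved in the target directory under its own name, as the examples above suggest.

```python
import os


def fetch_if_missing(name, directory, fetcher):
    """Call fetcher(name, directory) only if directory/name does not exist.

    Returns True when a download was triggered, False otherwise.
    """
    path = os.path.join(directory, name)
    if os.path.exists(path):
        return False
    fetcher(name, directory)
    return True


# Usage with the toolbox shortcut shown above (requires serenata-toolbox):
# from serenata_toolbox.datasets import fetch
# fetch_if_missing('2018-01-05-reimbursements.xz', 'data/', fetch)
```

Passing the fetcher as an argument keeps the helper independent of the toolbox, so it can be tried out with a stand-in function first.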
If you ever wonder how we generated these datasets, this toolbox can help you too (at least with the most used ones; the others are generated in our main repo):
from serenata_toolbox.federal_senate.dataset import Dataset as SenateDataset
from serenata_toolbox.chamber_of_deputies.reimbursements import Reimbursements as ChamberDataset
chamber = ChamberDataset('2018', 'data/')
chamber()
senate = SenateDataset('data/')
senate.fetch()
senate.translate()
senate.clean()
The full documentation is still a work in progress. If you wanna give us a hand you will need Sphinx:
$ cd docs
$ make clean;make rst;rm source/modules.rst;make html
Firstly, you should create a development environment with Python's venv module to isolate your development. Then clone the repository and build the package by running:
$ git clone https://github.com/okfn-brasil/serenata-toolbox.git
$ cd serenata-toolbox
$ python setup.py develop
Always add tests to your contribution — if you want to test it locally before opening the PR:
$ pip install tox
$ tox
When the tests are passing, also check for coverage of the modules you edited or added — if you want to check it before opening the PR:
$ tox
$ open htmlcov/index.html
Follow PEP8 and best practices implemented by Landscape in the veryhigh strictness level — if you want to check them locally before opening the PR:
$ pip install prospector
$ prospector -s veryhigh serenata_toolbox
If this report includes issues related to the import section of your files, isort can help you:
$ pip install isort
$ isort **/*.py --diff
Always suggest a version bump. We use Semantic Versioning – or in Elm community words:
- MICRO: the API is the same, no risk of breaking code
- MINOR: values have been added, existing values are unchanged
- MAJOR: existing values have been changed or removed
This is really important because every change merged to master triggers the CI, and the CI then triggers a new release to PyPI. The attempt to roll out a new version of the toolbox will fail without a version bump. So we do encourage you to add a version bump even if all you have changed is the README.rst, since this is the way to keep the README.rst updated on PyPI.
If you are not changing the API or README.rst in any sense and if you really do not want a version bump, you need to add [skip ci] to your commit message.
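To make the MICRO, MINOR and MAJOR levels described above concrete, here is a minimal sketch of how each kind of bump changes a version string. `bump_version` is a hypothetical helper for illustration only; it assumes a plain `MAJOR.MINOR.MICRO` string and says nothing about where the toolbox actually stores its version number.

```python
def bump_version(version, part):
    """Return a new version string with the given part incremented.

    Parts to the right of the bumped one are reset to zero, following
    the usual Semantic Versioning convention.
    """
    major, minor, micro = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "micro":
        return f"{major}.{minor}.{micro + 1}"
    raise ValueError(f"unknown part: {part!r}")


print(bump_version("12.3.4", "micro"))  # 12.3.5
print(bump_version("12.3.4", "minor"))  # 12.4.0
print(bump_version("12.3.4", "major"))  # 13.0.0
```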
And finally take The Zen of Python into account:
$ python -m this