This repo should be used as a template for new data science projects at Penn Medicine, especially for any project that is expected to push data back into the EMR.
To use this repo, you first need a Python environment with cruft installed. To install cruft, run
pip install cruft
or the equivalent for your environment.
The project name must be a valid PostgreSQL schema name, Python module name, and Nomad task name.
Choose a project / repo name:
- Short but does not abbreviate words
- All lower case
- No hyphens
- No underscores
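The naming rules above can be sketched as a small check to run before creating the project. This is a minimal sketch and not part of the template: the function name `is_valid_project_name` is hypothetical, and the pattern assumes lowercase ASCII letters and digits starting with a letter. The 63-character cap comes from PostgreSQL's default identifier length limit. PostgreSQL reserved words and Nomad-specific limits are not checked here.

```python
import keyword
import re

# Assumption: lowercase ASCII letters and digits only, starting with a letter.
# This rules out upper case, hyphens, and underscores in one pattern.
NAME_PATTERN = re.compile(r"[a-z][a-z0-9]*")

# PostgreSQL truncates identifiers longer than 63 bytes by default.
MAX_LENGTH = 63


def is_valid_project_name(name: str) -> bool:
    """Return True if `name` satisfies the naming rules listed above."""
    return (
        len(name) <= MAX_LENGTH
        and NAME_PATTERN.fullmatch(name) is not None
        # A Python keyword such as "class" cannot be imported as a module.
        and not keyword.iskeyword(name)
    )
```

For example, `is_valid_project_name("sepsis")` passes, while `"my-project"`, `"my_project"`, and `"Sepsis"` are rejected. These names are illustrative only.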
Once cruft is installed you can run the following to create your new project:
cruft create https://github.com/pennsignals/dsdk_cookie
Once your project is created, push it to GitHub by creating a new private repo and running:
cd {your-project-name}
git init
git add -A
git commit -m "first commit"
git branch -M main
git remote add origin git@github.com:pennsignals/{name}.git
git push -u origin main
Occasionally, updates will be made to dsdk_cookie that should be rolled into your project to keep it up to date. To check whether your project is up to date, run the following:
cruft check
If your project is out of date you can run the following to update it:
cruft update
Further instructions on how to use your project can be found in your project's readme.md at https://github.com/pennsignals/{your-repo-name}#readme
- Add cruft check to CICD in {{cookiecutter.name}}/.github/workflows. Maybe as part of test.yml? Maybe its own thing?
- Add .github/test.yml to the root of this repo and have it run cookiecutter against the repo root (cookiecutter .) to ensure that the template isn't broken when updates are made. See #3
- Update {{cookiecutter.name}}/readme.md with instructions on how to use cruft update.