[DOC] expand coding standards #78
cc @sjshim, please let me know if you have any feedback on this! 👩‍💻
This is definitely much better! I'm hoping the materials on packaging will be more finalized following next week's meeting?
It will definitely be good to get more ideas down following that discussion 😸 @poldrack, let us know if this looks OK (enough, for now) to merge on your end!
assert all(df1.index == df2.index)
This is more of a unit test helper. Something like `assert_matching_indices(dataframe1, dataframe2)`. A unit test would then be something like:

    def test_dataframe_transformation():
        df = make_default_df()
        transformed_df = my_transformation(df)
        # Whatever else we do, do not break the index
        assert_matching_indices(df, transformed_df)
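
For illustration, here is a minimal sketch of what such a helper might look like. The body is an assumption, not code from this PR: it uses `Index.equals` rather than the elementwise `==` comparison, which can raise a different error when the two indices differ in length.

```python
import pandas as pd


def assert_matching_indices(left: pd.DataFrame, right: pd.DataFrame) -> None:
    """Raise AssertionError unless both frames have the same index.

    Hypothetical helper: the name comes from the review comment,
    the implementation is only one possible choice.
    """
    # Index.equals returns False (instead of raising) when the lengths,
    # order, or values of the two indices differ.
    assert left.index.equals(right.index), (
        f"Index mismatch: {left.index.tolist()} != {right.index.tolist()}"
    )
```

Keeping the check in a named helper lets every transformation test state its intent in one line, and the failure message shows both indices.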
Projects that aim to develop pip-installable packages should follow current best practices in Python packaging.
As of May 2024, these are outlined in [this blog post](https://effigies.gitlab.io/posts/python-packaging-2023/) by lab member Chris Markiewicz.
I might suggest https://www.pyopensci.org/python-package-guide/package-structure-code/python-package-build-tools.html as a more thorough guide.
Compare this with a modular, portable refactoring:

```python
# load health data
def load_health_data(datadir, filename='health.csv'):
    return pd.read_csv(os.path.join(datadir, filename), index_col=0)
```
These don't quite do the same things. If you're going to say "don't do A, do B", it would be good if A and B produced the same result.
Do you want to add something like:

    data = load_health_data(datadir)
    demeaned = data[columns].dropna().mean(axis=1)

Do you want to go into getting `datadir` from `os.environ` or `sys.argv`? Given the bullet points above, that might help make clear what the alternatives look like.
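
As a rough sketch of those alternatives, the data directory could be resolved from the command line first and the environment second. The function name `get_datadir` and the environment variable `HEALTH_DATADIR` are hypothetical and do not come from the PR.

```python
import os
import sys
from pathlib import Path


def get_datadir(argv=None, environ=None, env_var="HEALTH_DATADIR"):
    """Resolve the data directory without hard-coding a path.

    Hypothetical helper: the first command-line argument wins, then the
    environment variable named by `env_var` (an assumed name).
    """
    # Default to the real process arguments and environment, but allow
    # both to be injected so the function is easy to test.
    argv = sys.argv[1:] if argv is None else argv
    environ = os.environ if environ is None else environ
    if argv:
        return Path(argv[0])
    if env_var in environ:
        return Path(environ[env_var])
    raise ValueError(f"Pass a data directory as an argument or set ${env_var}")
```

A script could then call `load_health_data(get_datadir())`, so the same code runs unchanged on any machine where the path is supplied at run time.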
Addresses #31, #62