Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DOC] expand coding standards #78

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

emdupre
Copy link
Contributor

@emdupre emdupre commented May 13, 2024

Addresses #31, #62

  • To be discussed : Do we want to add recommendations on using LLMs in coding ?

@emdupre emdupre force-pushed the doc/coding-standards branch from 99db30b to 87fb4b8 Compare May 13, 2024 22:16
@emdupre
Copy link
Contributor Author

emdupre commented May 13, 2024

cc @sjshim, please let me know if you have any feedback on this ! 👩‍💻

@sjshim
Copy link
Contributor

sjshim commented May 13, 2024

This is definitely much better! Hoping that materials re:packaging will finalize more following next week's meeting?

@emdupre
Copy link
Contributor Author

emdupre commented May 13, 2024

Definitely will be good to get more ideas down following that discussion 😸

@poldrack, let us know if this looks OK (enough, for now) to merge on your end !

Comment on lines +81 to +82
assert all(df1.index == df2.index)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more of a unit test helper. Something like assert_matching_indices(dataframe1, dataframe2). A unit test would then be something like:

def test_dataframe_transformation():
    df = make_default_df()
    transformed_df = my_transformation(df)
    # Whatever else we do, do not break the index
    assert_matching_indices(df, transformed_df)


For projects that aim to develop pip-installable packages should follow current best-practices in Python Packaging.
As of May 2024, this is outlined in [this blog post](https://effigies.gitlab.io/posts/python-packaging-2023/) by lab member Chris Markiewicz.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +63 to +71
```

Compare this with a modular, portable refactoring:

```python
# load health data
def load_health_data(datadir, filename='health.csv'):
return pd.read_csv(os.path.join(datadir, filename), index_col=0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These don't quite do the same things. If you're going to say, don't do A, do B, it would be good if A and B produced the same result.

Do you want to add something like:

data = load_health_data(datadir)
demeaned = data[columns].dropna().mean(1)

Do you want to go into getting datadir from os.environ or sys.argv? Given the bullet points above, that might help make clear what the alternatives look like.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants