Implemented ingestion #9

Open
wants to merge 5 commits into
base: main

Conversation

SanjeevLakhwani

No description provided.

Contributor

@v-rocheleau v-rocheleau left a comment

Small requests, looks good otherwise! Nice use of pandas for validation!

Comment on lines 69 to 75
try:
    df = df.applymap(lambda x: int(x) if pd.notna(x) else x)
except ValueError as e:
    raise HTTPException(
        status_code=status.HTTP_400_BAD_REQUEST,
        detail=f"Data type error: Non-integer value found: {e}",
    )
Contributor

We can let the parent try/except handle this exception, since we already handle ValueError there

Author

They handle different cases of ValueError. The inner handler catches the ValueError that applymap raises when a cell has the wrong data type and cannot be converted to an integer, while the outer handler catches ValueErrors raised while loading the DataFrame, which could come from other problems in the CSV, for example mixed data types in a column.

Member

If this is the case but non-obvious, add a code comment explaining it exactly as you did here.
In general, if something in a PR is non-obvious and someone comments on it, it's good to add a code comment. The same goes for below.
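A minimal sketch of what the requested code comments could look like, assuming the surrounding function loads a CSV with pandas inside an outer try/except. `BadRequest`, `load_counts`, `to_int_or_missing`, and the "Error parsing CSV" message are hypothetical names for illustration, not taken from the PR, and `Series.map` is used per column here in place of `applymap`.

```python
import io

import pandas as pd


class BadRequest(Exception):
    """Hypothetical stand-in for an HTTPException with a 400 status."""


def to_int_or_missing(value):
    # Leave missing cells untouched; int() raises ValueError for text such as "abc".
    return int(value) if pd.notna(value) else value


def load_counts(csv_text: str) -> pd.DataFrame:
    try:
        # Outer handler: covers ValueErrors raised while pandas loads the CSV
        # (for example an empty or malformed file).
        df = pd.read_csv(io.StringIO(csv_text), index_col=0)

        try:
            # Inner handler: covers ValueErrors raised by the cell-by-cell
            # integer conversion, so the client gets a specific message.
            df = df.apply(lambda column: column.map(to_int_or_missing))
        except ValueError as e:
            raise BadRequest(f"Data type error: Non-integer value found: {e}") from e
    except ValueError as e:
        raise BadRequest(f"Error parsing CSV: {e}") from e
    return df
```

Because `BadRequest` is not a `ValueError`, the error raised by the inner handler passes through the outer `except ValueError` unchanged, so each failure keeps its own message.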

transcriptomics_data_service/routers/ingest.py (outdated)
if df.index.duplicated().any():
    raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Duplicate Gene IDs detected.")
if df.columns.duplicated().any():
    raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Duplicate Sample IDs detected.")
Member

these HTTP exceptions are re-caught below as 500s, which is incorrect.
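One common way to avoid this, sketched under assumptions: re-raise `HTTPException` before the catch-all handler so deliberate 400s keep their status code. The `HTTPException` class below is a minimal stand-in so the sketch runs without FastAPI, and `validate_ids` and `ingest` are hypothetical helpers, not the code in the PR.

```python
class HTTPException(Exception):
    """Minimal stand-in for fastapi.HTTPException (assumption for this sketch)."""

    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def validate_ids(gene_ids, sample_ids):
    # Hypothetical helper mirroring the duplicate checks under review.
    if len(set(gene_ids)) != len(gene_ids):
        raise HTTPException(400, "Duplicate Gene IDs detected.")
    if len(set(sample_ids)) != len(sample_ids):
        raise HTTPException(400, "Duplicate Sample IDs detected.")


def ingest(gene_ids, sample_ids):
    try:
        validate_ids(gene_ids, sample_ids)
        return "ok"
    except HTTPException:
        # Deliberate client errors pass through with their original status code.
        raise
    except Exception as e:
        # Only unexpected failures are converted to a 500.
        raise HTTPException(500, f"Unexpected error: {e}") from e
```

The order of the `except` clauses matters: Python tries them top to bottom, so the specific `HTTPException` clause has to come before the generic `Exception` clause.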

3 participants