Add filtering of invalid intensity values in log transformation, add message to warn user #557

Open
henninggaertner wants to merge 5 commits into base: dev

@henninggaertner (Collaborator) commented Dec 2, 2024

Description

fixes #41
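log(0) is undefined (numerically it evaluates to negative infinity), and the logarithm of a negative value is not a real number, so neither is a meaningful result for an intensity. Such values are therefore filtered out and dropped from the data before the transformation. The user is notified about the dropped values and urged to adjust their pre-processing pipeline.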
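
To illustrate the problem, here is a minimal sketch of what NumPy returns for such inputs. It is not taken from transformation.py, and the array values are made up for the example.

```python
import numpy as np

# Made-up intensities: a zero, a negative value, and a valid value.
intensities = np.array([0.0, -1.0, 8.0])

# NumPy emits RuntimeWarnings for log of zero / negative input;
# they are silenced here only to keep the example output clean.
with np.errstate(divide="ignore", invalid="ignore"):
    logged = np.log2(intensities)

print(logged)
```

The zero becomes negative infinity and the negative value becomes NaN, while only the valid intensity yields a usable number.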

Changes

Add filtering, dropping and message in transformation.py
Adjust and add tests in test_transformation.py
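
A rough sketch of the kind of filtering described here, assuming a long-format pandas DataFrame. The function name `log_transform_dropping_invalid`, the column names and the message wording are hypothetical and are not the actual code in transformation.py.

```python
import numpy as np
import pandas as pd


def log_transform_dropping_invalid(df, column="Intensity", log_base="log2"):
    """Hypothetical helper: drop non-positive intensities, then log-transform."""
    log_fn = {"log2": np.log2, "log10": np.log10}[log_base]

    # Assumption: missing values (NaN) are kept; only values <= 0 are invalid.
    invalid = df[column] <= 0
    n_dropped = int(invalid.sum())

    transformed = df.loc[~invalid].copy()
    transformed[column] = log_fn(transformed[column])

    # Collect a user-facing message only if something was dropped.
    messages = []
    if n_dropped:
        messages.append(
            f"Dropped {n_dropped} row(s) with intensity <= 0 before the "
            f"{log_base} transformation. Consider filtering these values "
            "earlier in your pre-processing pipeline."
        )
    return transformed, messages


# Made-up example data with one zero and one negative intensity.
df = pd.DataFrame(
    {
        "Sample": ["S1", "S1", "S2", "S2"],
        "Protein ID": ["P1", "P2", "P1", "P2"],
        "Intensity": [8.0, 0.0, -3.0, 1024.0],
    }
)
result, messages = log_transform_dropping_invalid(df)
```

In this sketch the two invalid rows are removed, the remaining intensities are transformed, and a single message reports how many rows were dropped.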

Testing

Create a run with both protein and peptide import. Then manually change some intensity values to be 0 or negative in the run's dataframe folder. Then do the log transformation with the "dirty" data.
You should see a message about the dropped data.

PR checklist

Development

  • If necessary, I have updated the documentation (README, docstrings, etc.)
  • If necessary, I have created / updated tests.

Mergeability

  • main-branch has been merged into local branch to resolve conflicts
  • The tests and linter have passed AFTER local merge
  • The code has been formatted with black

Code review

  • I have self-reviewed my code.
  • At least one other developer reviewed and approved the changes

Add message warning the user of filtered data in transformation.py.
@henninggaertner henninggaertner linked an issue Dec 2, 2024 that may be closed by this pull request

github-actions bot commented Dec 2, 2024

Coverage report

Files with coverage changes (lines missing coverage listed after the file name):
  protzilla
  run.py
  run_helper.py
  runner.py
  protzilla/data_analysis
  differential_expression_helper.py 125, 143
  differential_expression_kruskal_wallis.py 227
  differential_expression_linear_model.py
  differential_expression_mann_whitney.py 266
  differential_expression_t_test.py
  plots.py
  protein_graphs.py
  ptm_analysis.py 56, 123-125
  protzilla/data_integration
  di_plots.py
  protzilla/data_preprocessing
  outlier_detection.py
  plots.py
  protzilla/importing
  ms_data_import.py 122, 276
  protzilla/methods
  data_analysis.py 21, 174, 250, 283, 308, 351, 916, 935
  data_preprocessing.py
  importing.py 144
  protzilla/utilities
  transform_dfs.py
  utilities.py
  ui/main
  settings.py
  ui/runs
  fields.py
  views.py 21, 250, 284
  views_helper.py
  ui/runs/forms
  data_analysis.py 318-322, 379, 409, 445, 468, 1183, 1221, 1243

The report is truncated to 25 files out of 55. To see the full report, please visit the workflow summary page.

This report was generated by python-coverage-comment-action


Successfully merging this pull request may close these issues.

#TODO log transform of 0 values in df