Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor snipping to enable trans, inter-arm, etc pileups #475

Open
wants to merge 21 commits into
base: master
Choose a base branch
from

Conversation

sergpolly
Copy link
Member

@sergpolly sergpolly commented Jan 15, 2024

Minimally invasive snipping refactor to enable pileups in trans:

  • user facing pileups behavior/API is unchanged !
  • internal API changes:
    • view_df is no longer optional in the snipper-classes, it's a positional argument now
    • internal _pileup require explicit feature_type (bed/bedpe) parameter now - looks cleaner that way
    • internal _extract_stack is now more specialized - work for bed features only, extract_stack_pairwise deals with bedpe
  • misc internal refactors:
    • all check removed from snipper-classes, moved to the user-facing pileups function
    • np.fmax trick for "on-diagonals" is now inside internal _pileups function
    • assign_view_auto -> combined_assignments_column=False -> no longer returns region column for BEDPE features
      • this was the first obstacle preventing trans pileups !
    • added more comments, where behavior seemed surprising
    • modified snipper classes:
      • min_diag doesn't require diagonal indicator matrix
      • expected - pre grouped and reindexed in the init
      • toeplitz expected works for arbitrary cis-region
      • removed old unused code for padding out of bound snippets - easier to read now
  • Updated snipping tests:
    • added 2 tests where we actually compare values to naive clr.matrix.fetch results !
    • fixed few windows/features - that were not view-annotated, but advertised as such
    • used pytest fixtures to define clr,exp,view_df that we reuse in most tests

TODO

  • run flake/ruff/black on code
    • existing .flake8 -> [ruff]-section of pyproject.toml using flake8-to-ruff, ran run format on files modified by the PR
  • add feature_type to user-API + default_cols pileup: bed vs. bedpe inference #427
  • add arbitrary cis region tests

Future Notes

  • the whole special treatment of BED-like/on-diagonal feature can probably be deprecated - i.e. on-diagonal should become a special case of bedpe, where region1==region2 - snipper-classes actually do that internally anyways - would yeild much cleaner code (same as in Refactoring snipping & pileups #227 (comment))
  • the whole lo/hi business in the main-user facing function is not really used anywhere - snipper classes end up recalculating it anyways - so either cut it off, or actually make snipper-classes make use of lo/hi columns
  • consider dealing with view-unannotated features separately e.g. in main pileups function

Misc Notes

  • side effect of this refactor is that if one provides inter-arm features to the pileup together with just an intra-arm expected - the script will crash/raise KeyError

@sergpolly sergpolly changed the title Refactor snipping to enable trans pileups Refactor snipping to enable trans, inter-arm, etc pileups Jan 26, 2024
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant