Add ISM functions and update tutorial #29

Open · wants to merge 1 commit into base: main

lgunsalus
Collaborator

This pull request adds in silico mutagenesis (ISM) scoring for motifs.

1. Updated src/grelu/interpret/motifs.py:
    • Added calculate_ism_weight function to compute ISM weights across the scan motif DataFrame.
2. Updated src/grelu/visualize.py:
    • Added _moving_average function for calculating moving averages of arrays.
    • Added plot_motifs_on_ism function to visualize motifs on ISM data.
3. Updated docs/tutorials/5_variant.ipynb:
    • Added examples of using the new functions.

Potential todo: draw motif logos on plot.

Testing:

  • New functions have been tested in the updated tutorial.
  • Pre-commit checks have passed.
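
For readers unfamiliar with the smoothing step, here is a minimal sketch of a moving average over a 1-D array. It is not the code from this PR: the name moving_average_sketch is hypothetical, and the only thing taken from the diff is the idea of a window parameter n (as in the call _moving_average(max_ism, n=rolling_buffer)). The sketch assumes a "valid" window, so the output is shorter than the input by n - 1 elements; the PR's function may handle the edges differently.

```python
import numpy as np


def moving_average_sketch(a: np.ndarray, n: int = 3) -> np.ndarray:
    """Hypothetical helper: mean of each length-n window of a 1-D array."""
    if n < 1 or n > len(a):
        raise ValueError("n must be between 1 and len(a)")
    # A uniform kernel of weight 1/n turns the convolution into a window mean.
    kernel = np.ones(n) / n
    # mode="valid" keeps only windows that lie fully inside the array.
    return np.convolve(a, kernel, mode="valid")


smoothed = moving_average_sketch(np.array([1.0, 2.0, 3.0, 4.0, 5.0]), n=3)
```

With a "valid" window, any x-axis positions plotted against the smoothed values need to be trimmed or offset to match the shorter output.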


Args:
    ism (np.ndarray): The ISM array
    scan (pd.DataFrame): The motif scan DataFrame
Collaborator

Let's specify which functions create ism and scan

rolling_max_ism = _moving_average(max_ism, n=rolling_buffer)

# Create plot
plt.figure(figsize=(10, 4), dpi=200)
Collaborator

Can we implement this in plotnine for consistency with the rest of the package?

@@ -317,3 +317,19 @@ def compare_motifs(
    scan["foldChange"] = scan.alt / scan.ref
    scan = scan.sort_values("foldChange").reset_index(drop=True)
    return scan


def calculate_ism_weight(row: pd.Series, ism: np.ndarray) -> float:
Collaborator

Currently, we average the ISM values over the motif without accounting for whether the shape of the ISM signal (i.e. relative score of different nucleotides at the position) matches that of the motif. Wouldn't it be better if we convolved the ISM matrix with the motif matrix - e.g. using np.convolve or scipy.signal.convolve?
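
To make the difference concrete, here is a rough sketch contrasting the two approaches. It is only an illustration under stated assumptions, not the PR's implementation: the function names mean_ism_weight and motif_match_scores are hypothetical, the ISM array is assumed to have shape (4, L) and the motif matrix shape (4, W) with the same base order in both, and the toy data are made up. Note that np.convolve flips the kernel, so the sketch uses np.correlate, which is equivalent to convolving with the position-reversed motif.

```python
import numpy as np


def mean_ism_weight(ism: np.ndarray, start: int, end: int) -> float:
    """Hypothetical: average ISM over the motif span, ignoring which base scores."""
    return float(ism[:, start:end].mean())


def motif_match_scores(ism: np.ndarray, motif: np.ndarray) -> np.ndarray:
    """Hypothetical: slide the motif along the ISM array and sum the
    base-by-base products at each offset (cross-correlation)."""
    if ism.shape[0] != motif.shape[0]:
        raise ValueError("ism and motif must have the same number of bases")
    # Correlate each base row separately, then add the rows together.
    per_base = [
        np.correlate(ism[b], motif[b], mode="valid") for b in range(ism.shape[0])
    ]
    return np.sum(per_base, axis=0)


# Toy motif "ACG" as a one-hot (4, 3) matrix; rows assumed to be A, C, G, T.
motif = np.zeros((4, 3))
motif[0, 0] = motif[1, 1] = motif[2, 2] = 1.0

# ISM signal whose shape matches the motif at positions 3..5.
ism = np.zeros((4, 10))
ism[:, 3:6] = motif

# ISM signal with the same magnitude at positions 3..5 but on the wrong base (T).
ism_other = np.zeros((4, 10))
ism_other[3, 3:6] = 1.0

scores = motif_match_scores(ism, motif)
scores_other = motif_match_scores(ism_other, motif)
```

In this toy case both signals get the same mean over the motif span, while the correlation score peaks at offset 3 only for the signal whose per-base shape matches the motif. scipy.signal.correlate2d with mode="valid" would be another way to compute the same sliding score.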

2 participants