Add conditional variance #712
base: main
for more information, see https://pre-commit.ci
latent_space_selection:
    Key or Keys which specifies the latent or feature space used for computing the conditional variance.
    A single key has to be a latent space in :attr:`~anndata.AnnData.obsm` or
    a gene in :attr:`~anndata.AnnData.var_names`.
a feature in ..., because we might also store proteins/ATAC, etc.
    Key or Keys which specifies the latent or feature space used for computing the conditional variance.
    A single key has to be a latent space in :attr:`~anndata.AnnData.obsm` or
    a gene in :attr:`~anndata.AnnData.var_names`.
    A set of keys has to be a subset of genes in :attr:`~anndata.AnnData.var_names`.
The type hint says list, not set.
source: K,
target: K,
forward: bool = True,
latent_space_selection: Union[str, list[str]] = "X_pca",
Let's not call it latent space; it can also be a raw space, e.g. gene space. Also, the types are not clear.
    mask = [var_name in latent_space_selection for var_name in self.adata.var_names]
    latent_space = self.adata[:, mask].X.toarray()
else:
    raise KeyError("Unknown latent space selection.")
When we have a key error, we want to print what the wrong key is.
filter_value = source if forward else target
opposite_filter_value = target if forward else source

if isinstance(latent_space_selection, str):
    if latent_space_selection in self.adata.obsm:
        latent_space = self.adata.obsm[latent_space_selection]
    elif latent_space_selection in self.adata.var_names:
        latent_space = self.adata[:, latent_space_selection in self.adata.var_names].X.toarray()
    else:
        raise KeyError("Gene/Latent space not found.")
elif type(latent_space_selection) in [list, np.ndarray]:
    mask = [var_name in latent_space_selection for var_name in self.adata.var_names]
    latent_space = self.adata[:, mask].X.toarray()
else:
    raise KeyError("Unknown latent space selection.")
Let's make this a function (within the function)
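A rough sketch of what such a helper could look like, also covering the earlier point about naming the offending key in the error. The name `resolve_feature_space` and its arguments are hypothetical; it assumes a dense `X` (the PR calls `.toarray()` on a sparse matrix) and takes `obsm`, `var_names`, and `X` directly so it runs without an `AnnData` object.

```python
from collections.abc import Mapping, Sequence
from typing import Union

import numpy as np


def resolve_feature_space(
    selection: Union[str, Sequence[str]],
    obsm: Mapping[str, np.ndarray],
    var_names: Sequence[str],
    X: np.ndarray,
) -> np.ndarray:
    """Return the (n_obs, n_features) matrix chosen by `selection` (hypothetical helper)."""
    names = list(var_names)
    if isinstance(selection, str):
        if selection in obsm:
            return np.asarray(obsm[selection])
        if selection in names:
            # Index with a list so a single feature still yields a 2-D array.
            return np.asarray(X)[:, [names.index(selection)]]
        raise KeyError(f"Key {selection!r} not found in `obsm` or `var_names`.")
    if isinstance(selection, (list, tuple, np.ndarray)):
        missing = [key for key in selection if key not in names]
        if missing:
            # Report exactly which keys are wrong.
            raise KeyError(f"Keys {missing!r} not found in `var_names`.")
        # Columns follow the order requested by the caller.
        return np.asarray(X)[:, [names.index(key) for key in selection]]
    raise TypeError(f"Expected `str` or a list of `str`, got {type(selection).__name__!r}.")
```

Two deliberate differences from the code in the diff: the single-feature branch selects the named column instead of indexing with the result of an `in` check, and a list selection keeps the caller's order rather than the order of `var_names`. Unknown keys in a list raise instead of being silently dropped by the mask.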
)

cond_var = []
for i in range(cond_dists.shape[1]):  # type: ignore[union-attr]
can we vectorize this?
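One possible vectorized form, as a sketch only. The loop body is not visible in this hunk, so the definition used here is an assumption: each column of `cond_dists` is a probability vector over cells, and the conditional variance is the trace of the weighted covariance of the feature matrix under that column. The function name is hypothetical.

```python
import numpy as np


def conditional_variance(cond_dists: np.ndarray, features: np.ndarray) -> np.ndarray:
    """Weighted total variance per column of `cond_dists` (assumed definition).

    cond_dists: (n_cells, n_batch), each column sums to 1.
    features:   (n_cells, n_dims).
    """
    # Weighted mean of the features for every column at once: (n_batch, n_dims).
    means = cond_dists.T @ features
    # Weighted mean of the squared norms: (n_batch,).
    second_moments = cond_dists.T @ (features**2).sum(axis=1)
    # E||x||^2 - ||E x||^2, clipped to guard against tiny negative round-off.
    return np.clip(second_moments - (means**2).sum(axis=1), 0.0, None)
```

This replaces the per-column loop with two matrix products. If the intended quantity is something other than the trace of the weighted covariance, the same pattern applies but the second-moment term changes.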
batch_size = batch_size if batch_size is not None else len(df)
func = self.push if forward else self.pull
for batch in range(0, len(df), batch_size):
why do we actually do this ? :)
    batch_size=batch_size,
)
if key_added is None:
    assert isinstance(out, pd.DataFrame)
Check for some properties, e.g. no NaN, non-negativity.
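For instance, the test could call a small check like the one below on the returned values. The helper name `check_variance_output` is hypothetical, and it assumes the output is purely numeric so that it converts to a float array.

```python
import numpy as np


def check_variance_output(values) -> None:
    """Assert basic sanity properties of conditional-variance values (hypothetical helper)."""
    arr = np.asarray(values, dtype=float)
    # A variance must be defined for every entry.
    assert not np.isnan(arr).any(), "conditional variance contains NaN"
    # A variance can never be negative.
    assert (arr >= 0).all(), "conditional variance contains negative values"
```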
@pytest.mark.parametrize("key_added", [None, "test"])
@pytest.mark.parametrize("batch_size", [None, 2])
@pytest.mark.parametrize("latent_space_selection", ["X_pca", "KLF12", ["KLF12", "Dlip3", "Dref"]])
def test_compute_variance_pipeline(
Also check that an error is raised for wrong attributes.
Added function compute_variance() as well as a test, similar to compute_entropy().