Debug h5ad saving #580
Codecov Report
Coverage diff, master vs. #580:
           master   #580     +/-
Coverage   22.51%   22.13%   -0.38%
Files      165      166      +1
Lines      26991    28043    +1052
Hits       6077     6208     +131
Misses     20914    21835    +921
Thanks Sichao, I made some comments. We can briefly discuss them.
def export_kmc(adata: AnnData) -> None:
    """Save the parameters of kmc and delete the kmc object from anndata."""
    kmc = adata.uns["kmc"]
    adata.uns["kmc_params"] = {
        "P": kmc.P,
        "Idx": kmc.Idx,
        "eignum": kmc.eignum,
        "D": kmc.D,
        "U": kmc.U,
        "W": kmc.W,
        "W_inv": kmc.W_inv,
        "Kd": kmc.Kd,
    }
    adata.uns.pop("kmc")
Why not add this at the place where `kmc` is saved to `adata.uns`?
In this way, we can keep a `kmc` object while we run the Dynamo analysis. If we need it, we just read it from `adata.uns`. The `Umap` object uses a different saving strategy: the umap parameters are saved instead of the object itself. My idea is that we can avoid creating a `Umap` object in the pipeline even when we run the umap dimension reduction. Unless the user wants to perform an inverse transform in a specific analysis like `fate`, they don't need this `Umap` instance. The creation of `KMC`, however, is inevitable if the user enables the `kmc` method in `tl.cell_velocities`. Since we have already created `KMC`, we can keep it for future use until saving to h5ad.
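The strategy described above can be sketched as an export/import round trip. This is only an illustration under assumptions, not the code in this PR: `FakeKMC`, `export_params`, and `import_params` are hypothetical names, a plain dict stands in for `adata.uns`, and only three of the attributes are shown.

```python
class FakeKMC:
    """Hypothetical stand-in for the KMC object; only a few attributes are shown."""

    def __init__(self, P, Idx, eignum):
        self.P = P
        self.Idx = Idx
        self.eignum = eignum


def export_params(uns: dict) -> None:
    """Replace the live object with a dict of its parameters before saving."""
    kmc = uns.pop("kmc")
    uns["kmc_params"] = {"P": kmc.P, "Idx": kmc.Idx, "eignum": kmc.eignum}


def import_params(uns: dict) -> None:
    """Rebuild the object from the saved parameters after loading."""
    params = uns.pop("kmc_params")
    uns["kmc"] = FakeKMC(**params)


# The object stays available during the analysis ...
uns = {"kmc": FakeKMC(P=[[0.9, 0.1], [0.2, 0.8]], Idx=[[1], [0]], eignum=2)}

# ... and is swapped for plain parameters only at save time.
export_params(uns)
exported_keys = sorted(uns)

# After loading, the object can be reconstructed from the parameters.
import_params(uns)
restored = uns["kmc"]
```

The point of the design is that the object is converted only at the save boundary, so code that runs between `tl.cell_velocities` and saving can keep using the instance directly.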
NCx, NCy = (
    [vecfld_dict["NCx"][index] for index in vecfld_dict["NCx"]],
    [vecfld_dict["NCy"][index] for index in vecfld_dict["NCy"]],
What is the behavior of this? Will it reorder the x/y coordinates of the nullclines?
This reads the nullclines from the dictionary to form lists of x and y coordinates. The order should be the same. Here is a reference indicating that regular dictionaries have kept their items in insertion order since Python 3.6 (as a CPython implementation detail in 3.6, and as a language guarantee from 3.7).
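A small sketch of the ordering behavior discussed here. The layout of `vecfld_dict` below is an assumption made for illustration (each nullcline list stored as a dict keyed by its position); the values are made up.

```python
# Hypothetical stand-in for vecfld_dict: each list of nullclines is assumed
# to have been stored as a dict keyed by its position in the original list.
vecfld_dict = {
    "NCx": {"0": [0.0, 1.0], "1": [2.0, 3.0], "2": [4.0, 5.0]},
    "NCy": {"0": [0.5, 1.5], "1": [2.5, 3.5], "2": [4.5, 5.5]},
}

# Iterating over a dict yields its keys in insertion order, so the rebuilt
# lists follow the order in which the entries were inserted.
NCx, NCy = (
    [vecfld_dict["NCx"][index] for index in vecfld_dict["NCx"]],
    [vecfld_dict["NCy"][index] for index in vecfld_dict["NCy"]],
)
```

This covers in-memory dictionaries only. It says nothing about whether key order survives a write to and read from an h5ad file.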
    negative_sample_rate=params["umap_kwargs"]["negative_sample_rate"],
    init_pos=params["umap_kwargs"]["init_pos"],
    random_state=params["umap_kwargs"]["random_state"],
    umap_kwargs=params["umap_kwargs"],
`umap_kwargs=params["umap_kwargs"]`: will this pass duplicated arguments to `construct_mapper_umap`?
`params["umap_kwargs"]` may pass duplicate arguments, but this should not raise an error because `construct_mapper_umap` uses `update_dict`.
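To illustrate why repeated keys need not cause an error, here is a minimal sketch that assumes `update_dict` behaves like a dictionary merge restricted to known keys. `merge_known_keys` and `build_mapper_kwargs` are hypothetical names, not dynamo's implementation, and the parameter values are made up.

```python
def merge_known_keys(defaults: dict, overrides: dict) -> dict:
    """Overwrite entries of `defaults` with values from `overrides`,
    but only for keys that `defaults` already contains."""
    defaults.update((k, overrides[k]) for k in defaults.keys() & overrides.keys())
    return defaults


def build_mapper_kwargs(negative_sample_rate, init_pos, random_state, umap_kwargs):
    """Hypothetical helper: the explicit arguments form the base dict, and the
    repeated keys in `umap_kwargs` overwrite them instead of colliding."""
    base = {
        "negative_sample_rate": negative_sample_rate,
        "init_pos": init_pos,
        "random_state": random_state,
    }
    return merge_known_keys(base, umap_kwargs)


params = {
    "umap_kwargs": {
        "negative_sample_rate": 5,
        "init_pos": "spectral",
        "random_state": 0,
    }
}

# The same values arrive twice: once explicitly and once inside umap_kwargs.
merged = build_mapper_kwargs(
    negative_sample_rate=params["umap_kwargs"]["negative_sample_rate"],
    init_pos=params["umap_kwargs"]["init_pos"],
    random_state=params["umap_kwargs"]["random_state"],
    umap_kwargs=params["umap_kwargs"],
)
```

Because the duplicates are merged into one dict rather than unpacked as keyword arguments alongside the explicit ones, no "multiple values for keyword argument" error arises in this sketch.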
  "average": average,
  "t": t,
  "prediction": prediction,
  # "VecFld": VecFld,
- "VecFld_true": VecFld_true,
+ # "VecFld_true": VecFld_true,
`VecFld_true` is the ground-truth vector field.
Will we need this in the pipeline?
  "init_states": init_states,
- "init_cells": init_cells,
+ "init_cells": list(init_cells),
Why does `list` apply only to `init_cells` and not to `init_states`?
`init_cells` can be an Index (so we convert it to a list), while `init_states` is an array (a more compatible data type).
Some parts of adata still have incompatible data types, including `NCx` and `NCy`.
In the future, we will use `import_h5ad` and `export_h5ad` for loading and saving processed data.