
Runtime error related to tensor size when running "analysis-gene_maps" #68

Open
fuerzhou opened this issue Mar 27, 2023 · 4 comments

@fuerzhou

Hi @ludvb,

Thank you for developing this tool for upscaling ST data.

After training the model, I came across the following error when running the analysis "analysis-gene_maps":

[2023-03-27 17:04:21,118] ℹ : Running analysis "analysis-gene_maps"
[2023-03-27 17:06:43,048] ⚠ WARNING : FutureWarning (/Users/fengyuzhou/myenv/lib/python3.8/site-packages/xfuse/data/dataset.py:114): iteritems is deprecated and will be removed in a future version. Use .items instead.
[2023-03-27 17:08:03,224] ⚠ WARNING : FutureWarning (/Users/fengyuzhou/myenv/lib/python3.8/site-packages/xfuse/data/dataset.py:114): iteritems is deprecated and will be removed in a future version. Use .items instead.
[2023-03-27 17:08:26,770] 🚨 ERROR : RuntimeError: The size of tensor a (30598626) must match the size of tensor b (23687921) at non-singleton dimension 0              
Traceback (most recent call last):
  File "/Users/fengyuzhou/myenv/lib/python3.8/site-packages/xfuse/analyze/prediction.py", line 146, in _sample
    yield from _run_model(
  File "/Users/fengyuzhou/myenv/lib/python3.8/site-packages/xfuse/analyze/prediction.py", line 91, in _run_model
    sample = sample / torch.as_tensor(sizes).to(sample).unsqueeze(1)
RuntimeError: The size of tensor a (30598626) must match the size of tensor b (23687921) at non-singleton dimension 0

For training, since I am very new to deep learning models, I used small parameter values so that the model would train quickly and I could get a sense of what the results look like. I am not sure whether the parameters I used caused the problem. Here is my config file:

[xfuse]
# This section defines modeling options. It can usually be left as-is.
network_depth = 3
network_width = 8
gene_regex = "^(?!RPS|RPL|MT-).*" # Regex matching genes to include in the model. By default, exclude mitochondrial and ribosomal genes.
min_counts = 50 # Exclude all genes with fewer reads than this value.

[settings]
cache_data = true
data_workers = 8 # Number of worker processes for data loading. If set to zero, run data loading in main thread.

[expansion_strategy]
# This section contains configuration options for the metagene expansion strategy.
type = "DropAndSplit" # Available choices: Extra, DropAndSplit
purge_interval = 200 # Metagene purging interval (epochs)

[expansion_strategy.Extra]
num_metagenes = 4
anneal_to = 1
anneal_epochs = 1000

[expansion_strategy.DropAndSplit]
max_metagenes = 20

[optimization]
# This section defines options used during training. It may be necessary to decrease the batch or patch size if running out of memory during training.
batch = 1
batch_size = 4
epochs = 10
learning_rate = 0.0003
patch_size = 1000 # Size of training patches. Set to '-1' to use as large patches as possible.

[analyses]
# This section defines which analyses to run. Each analysis has its own subtable with configuration options. Remove the table to stop the analysis from being run.

[analyses.analysis-gene_maps]
# Constructs a map of imputed expression for each gene in the dataset.
type = "gene_maps"

[analyses.analysis-gene_maps.options]
gene_regex = ".*"
num_samples = 1
genes_per_batch = 10
predict_mean = true
normalize = false
mask_tissue = true
scale = 1.0
writer = "image"

[analyses.analysis-metagenes]
# Creates summary data of the metagenes
type = "metagenes"

[analyses.analysis-metagenes.options]
method = "pca"

[slides]
# This section defines the slides to use in the experiment. Covariates are specified in the "covariates" table. Slide-specific options can be specified in the "options" table.
[slides.section1]
data = "converted/data.h5"
[slides.section1.covariates]
section = 1

Could you please help with the issue? Thank you in advance!

Yuzhou

@mattobny

I am experiencing the same issue. From what I can gather, the parameter normalize_size is coerced to True here:

for x in predict(
    num_samples=num_samples,
    genes_per_batch=len(genes_batch),
    predict_mean=predict_mean,
    normalize_scale=normalize,
    normalize_size=True,
)

This causes the following conditional to be entered, which is where the error occurs:

if normalize_size:
    _, sizes = np.unique(
        np.searchsorted(idxs, label.cpu().numpy()), return_counts=True,
    )
    sample = sample / torch.as_tensor(sizes).to(sample).unsqueeze(1)

This confuses me, as I also set the normalize parameter to false in my config file. I cannot figure out how to debug it, since I do not know what the variable idxs represents, and hence do not know what sizes represents either. Unfortunately, I am working in an environment (a shared cluster) where I cannot debug interactively. Please advise, and thank you for this powerful package @ludvb

@ludvb (Owner)

ludvb commented Mar 30, 2023

Thank you for the reports! Is this error happening in all your experiments or only in a particular dataset? Unfortunately, I have not been able to reproduce it on my own data, so debugging this will be a bit tricky.

To provide some context, the sizes variable should capture the number of pixels in each predicted location, and the line where the error occurs is used to normalize out any size differences. When the scale option to the gene_maps analysis is set to 1.0 (the default), each location is the size of one pixel. Therefore, sizes should be a vector of ones, and normalization is essentially a no-op in this case. Thus, as a quickfix, it should be safe to simply delete this line without affecting the analysis.
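
To make that concrete, here is a minimal sketch of the same computation, with made-up idxs and label values (I am assuming here that idxs holds the sorted indices of the predicted locations):

import numpy as np
import torch

# With scale = 1.0, each predicted location covers exactly one pixel,
# so every per-pixel label maps to its own location.
idxs = np.array([3, 7, 11, 19])       # made-up location indices (sorted)
label = torch.tensor([3, 7, 11, 19])  # made-up per-pixel labels

# Mirrors the computation quoted above from xfuse/analyze/prediction.py:
_, sizes = np.unique(
    np.searchsorted(idxs, label.numpy()), return_counts=True,
)
print(sizes)  # [1 1 1 1] -- all ones, so the division below is a no-op

sample = torch.rand(len(sizes), 2)    # stand-in for the predicted samples
sample = sample / torch.as_tensor(sizes).to(sample).unsqueeze(1)

The error you are seeing means sample and sizes ended up with different lengths along dimension 0, i.e. the number of sampled rows did not match the number of unique locations.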

If I recall correctly, the reason why we force size normalization in the gene maps analysis is that when 1/scale is fractional, the discretized locations may not all be the same size. To produce gene map images that don't have any weird banding-like artifacts, it is then necessary to normalize out those differences.
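
As a made-up illustration of the fractional case (this is not the exact discretization xfuse uses, only a sketch of the effect):

import numpy as np

# Discretize a 10-pixel axis with scale = 0.3; 1 / 0.3 is fractional,
# so the resulting locations cover different numbers of pixels.
scale = 0.3
pixels = np.arange(10)
locations = np.floor(pixels * scale).astype(int)
print(locations)  # [0 0 0 0 1 1 1 2 2 2]

_, sizes = np.unique(locations, return_counts=True)
print(sizes)      # [4 3 3] -- unequal sizes; without normalization these
                  # differences would show up as banding in the gene maps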

@AdrianaLecourieux

Hi,
I have the same issue when I'm running the analysis "gene_maps":
The size of tensor a (39911940) must match the size of tensor b (26755201) at non-singleton dimension 0
@fuerzhou @mattobny Did you find a solution?

My config file is:

[xfuse]
network_depth = 6
network_width = 16
min_counts = 50

[expansion_strategy]
type = "DropAndSplit"
[expansion_strategy.DropAndSplit]
max_metagenes = 50

[optimization]
batch_size = 2
epochs = 100
learning_rate = 0.0003
patch_size = 64

[analyses]
[analyses.metagenes]
type = "metagenes"
[analyses.metagenes.options]
method = "pca"

[analyses.gene_maps]
type = "gene_maps"
[analyses.gene_maps.options]
gene_regex = ".*"

[slides]
[slides.section1]
data = "xfuse-2023-04-20T10:48:25.418334/data.h5"
[slides.section1.covariates]
section = 1

@fuerzhou (Author)

@AdrianaLecourieux
As @ludvb suggested, deleting the line that caused the error resolved the issue.
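
For reference, this is the line we deleted, in _run_model in xfuse/analyze/prediction.py (line 91 in the traceback above; the exact line number may differ between versions):

# Per @ludvb, with scale = 1.0 this size normalization is a no-op,
# so the line can be removed without affecting the analysis:
sample = sample / torch.as_tensor(sizes).to(sample).unsqueeze(1)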
