Potential wrong row-standardization in lee.py #207

ChaozhongLiu · 2022-05-27T17:29:21Z

I could be wrong about this but it's confusing me for some time.

In spatial pearson statistic functions lee.py, line 83 and line 198, row-standardization is writen as:
standard_connectivity = self.connectivity / self.connectivity.sum(axis=1)

However, becaused the shape of self.connectivity.sum(axis=1) is (n,), when broadcasting it is row-wise division instead of column-wise.

Just by checking standard_connectivity.sum(axis=1) you will find it is not equal to 1. And this is problematic to the local spatial pearson statistic.

Possible editting could be standard_connectivity = self.connectivity / self.connectivity.sum(axis=1)[:, numpy.newaxis]?

Looking forward to your discussion!

The text was updated successfully, but these errors were encountered:

ljwolf · 2022-05-27T18:22:11Z

Yes, thanks! That does look like an issuse. I think we need to use self.connectivity.sum(axis=1, keepdims=True) to fix. Thanks! Will get this done ASAP.

ChaozhongLiu · 2022-05-27T18:31:26Z

Thanks! @ljwolf

jGaboardi · 2022-11-25T16:14:49Z

xref #91

ljwolf · 2023-05-24T22:12:19Z

Actually, this is correct as written thanks to the vagaries of the matrix API:

>>> from scipy import sparse
>>> import numpy
>>> numpy.random.seed(111211)
>>> connectivity = sparse.random(10,10,density=.4)
>>> connectivity.sum()
matrix([[2.50042315],
        [3.55327513],
        [0.79089601],
        [1.44499382],
        [1.63995059],
        [2.36368386],
        [1.9093926 ],
        [3.15104416],
        [2.07759805],
        [0.80082077]])

The documentation says that connectivity must be of type scipy.sparse.matrix, and is what comes from using w.sparse.

If you pass a numpy array describing connectivity that you've built yourself, this will be silently incorrect. But, we document the expected input type, and for that type, the code is correct.

I will add a fix that ensures this correction works for array types as well. Eventually, we need to move to scipy.sparse.csc_array(), but that will occur later.

ljwolf self-assigned this May 27, 2022

ljwolf added the bug label May 27, 2022

ljwolf linked a pull request May 24, 2023 that will close this issue

Bug fixes across Lee and swap to survival function for #243 #245

Merged

ljwolf added rough edge and removed bug labels May 25, 2023

sjsrey closed this as completed in #245 May 26, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Potential wrong row-standardization in lee.py #207

Potential wrong row-standardization in lee.py #207

ChaozhongLiu commented May 27, 2022

ljwolf commented May 27, 2022

ChaozhongLiu commented May 27, 2022

jGaboardi commented Nov 25, 2022

ljwolf commented May 24, 2023

Potential wrong row-standardization in lee.py #207

Potential wrong row-standardization in lee.py #207

Comments

ChaozhongLiu commented May 27, 2022

ljwolf commented May 27, 2022

ChaozhongLiu commented May 27, 2022

jGaboardi commented Nov 25, 2022

ljwolf commented May 24, 2023