Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Surface Interpolation Function [Fixes Issue45] #52

Closed
wants to merge 8 commits into from
Closed
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
65 changes: 61 additions & 4 deletions membrane_curvature/surface.py
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@ def derive_surface(atoms, n_cells_x, n_cells_y, max_width_x, max_width_y):

Returns
-------
z_coordinates: numpy.ndarray
z_coordinates: np.ndarray
Average z-coordinate values. Return Numpy array of floats of
shape `(n_cells_x, n_cells_y)`.

Expand All @@ -55,8 +55,8 @@ def get_z_surface(coordinates, n_x_bins=10, n_y_bins=10, x_range=(0, 100), y_ran

Parameters
----------
coordinates : numpy.ndarray
Coordinates of AtomGroup. Numpy array of shape=(n_atoms, 3).
coordinates : np.ndarray
Coordinates of AtomGroup. NumPy array of shape=(n_atoms, 3).
n_x_bins : int.
Number of bins in grid in the `x` dimension.
n_y_bins : int.
Expand All @@ -70,7 +70,7 @@ def get_z_surface(coordinates, n_x_bins=10, n_y_bins=10, x_range=(0, 100), y_ran
-------
z_surface: np.ndarray
Surface derived from set of coordinates in grid of `x_range, y_range` dimensions.
Returns Numpy array of floats of shape (`n_x_bins`, `n_y_bins`)
Returns NumPy array of floats of shape (`n_x_bins`, `n_y_bins`)

"""

Expand Down Expand Up @@ -131,3 +131,60 @@ def normalized_grid(grid_z_coordinates, grid_norm_unit):
z_normalized = grid_z_coordinates / grid_norm_unit

return z_normalized


def interpolation_by_array(array_surface):
"""
Interpolates values contained in `array_surface` over axis
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Over which axis?


Parameters
----------

array_surface: np.ndarray
Array of floats of shape (`n_x_bins`, `n_y_bins`)

Returns
-------
interpolated_array: np.ndarray
Returns interpolated array.
Array of shape (`n_x_bins` x `n_y_bins`,)

"""

# create mask for nans
mask_nans = ~np.isnan(array_surface)

# index of array_surface
index_array = np.arange(array_surface.shape[0])

# interpolate values in array
interpolated_array = np.interp(index_array,
np.flatnonzero(mask_nans),
array_surface[mask_nans])

return interpolated_array
Comment on lines +155 to +165
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite understanding this. mask_nans is 2D, with shape (n_x, n_y). index_array is 1D, with shape n_x. np.flatnonzero(mask_nans) and array_surface[mask_nans] will be 1D, of length between 0 to n_x * n_y. So I think that means that you are only ever operating on the first row of array_surface?

I guess my question is then, why is array_surface 2D? Why not just pass in one row at a time? In addition, as it is a 2D surface, why use a 1D interpolation function? Why not a 2D one?



def surface_interpolation(array_surface):
"""
Calculates interpolation of arrays.

Parameters
----------

array_surface: np.ndarray
Array of floats of shape (`n_x_bins`, `n_y_bins`)

Returns
-------

interpolated_surface: np.ndarray
Interpolated surface derived from set of coordinates
in grid of `x_range, y_range` dimensions.
Array of floats of shape (`n_x_bins`, `n_y_bins`)

"""

interpolated_surface = np.apply_along_axis(interpolation_by_array, axis=0, arr=array_surface)
ojeda-e marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see that you are applying interpolation_by_array by row, then. In that case I suggest amending the documentation of interpolation_by_array to specify that the input is 1-dimensional with length n_x, instead of n_x, n_y. Although I think doing 2D interpolation would be less arbitrary (choosing which axis is x and which is y is basically a random, right?)


return interpolated_surface
36 changes: 34 additions & 2 deletions membrane_curvature/tests/test_membrane_curvature.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,10 +4,11 @@


import pytest
from membrane_curvature.surface import normalized_grid, derive_surface, get_z_surface
from membrane_curvature.surface import (interpolation_by_array, normalized_grid,
derive_surface, get_z_surface, surface_interpolation)
from membrane_curvature.curvature import mean_curvature, gaussian_curvature
import numpy as np
from numpy.testing import assert_almost_equal
from numpy.testing import assert_almost_equal, assert_allclose
import MDAnalysis as mda
from membrane_curvature.tests.datafiles import (GRO_PO4_SMALL, XTC_PO4_SMALL)
from membrane_curvature.base import MembraneCurvature
Expand Down Expand Up @@ -209,6 +210,37 @@ def test_get_z_surface(x_bin, y_bin, x_range, y_range, expected_surface):
assert_almost_equal(surface, expected_surface)


@pytest.mark.parametrize('dummy_surface, expected_interpolated_surface', [
# array 3x3 with all 150 and one nan
(np.array(([150., 150., 150.],
[150., np.nan, 150.],
[150., 150., 150.])),
np.full((3, 3), 150.)),
# array 3x4 with all 150 and two nans
(np.array([[150., 150, 150., 150.],
[150., np.nan, np.nan, 150.],
[150., 150., 150., 150.]]),
np.full((3,4), 150.)),
# array 4x4 with all 150 and two nans
(np.array([[150., 150, 150., 150.],
[150., np.nan, np.nan, 150.],
[150., 130., 140., 150.],
[150., 150., 150., 150.]]),
np.array([[150., 150, 150., 150.],
[150., 140., 145., 150.],
[150., 130., 140., 150.],
[150., 150., 150., 150.]])),
# array 3x3 with lots of nans
(np.array([[np.nan, np.nan, 150.],
[150, np.nan, 150.],
[np.nan, 150., np.nan]]),
np.full((3, 3), 150.))
])
def test_surface_interpolation(dummy_surface, expected_interpolated_surface):
surface = surface_interpolation(dummy_surface)
assert_almost_equal(surface, expected_interpolated_surface)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For arrays of floats, the docstring for assert_almost_equal suggests using i.e., assert_allclose these days for better consistency.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I remember @orbeckst mentioned that MDAnalysis typically uses assert_almost_equal. Maybe we can have his opinion on what would be best here?

Copy link
Member

@orbeckst orbeckst Aug 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MDAnalysis uses assert_almost_equal in preference to assert_array_almost_equal. assert_allclose might be more recent than when we started using numpy tests so MDAnalysis might just be outdated and should be switching eventually. Follow @tylerjereddy 's advice :-).



class TestMembraneCurvature(object):
@pytest.fixture()
def universe(self):
Expand Down