'operands could not be broadcast together with shapes' when using 'pred_results.predictions' #152

Open
X-Fan-Jack opened this issue Apr 19, 2024 · 2 comments

X-Fan-Jack commented Apr 19, 2024

I followed the tutorial at https://pysal.org/notebooks/model/mgwr/GWR_prediction_example.html,
using geopandas and its sample method to split the data into train and test sets.

import numpy as np
from mgwr.gwr import GWR
from mgwr.sel_bw import Sel_BW

gdf = data_geo.to_crs('EPSG:27700')
gdf_train = gdf.sample(frac=0.8, axis=0, random_state=RANDOM_SEED)
gdf_test = gdf[~gdf.index.isin(gdf_train.index)]

X_train = gdf_train.drop(['Y', 'geometry'], axis=1).values
y_train = gdf_train['Y'].values.reshape((-1,1))
u = gdf_train.geometry.x
v = gdf_train.geometry.y
coords_train = list(zip(u,v))
selector = Sel_BW(coords_train, y_train, X_train)
gwr_bw = selector.search()
print('GWR bandwidth =', gwr_bw)
model = GWR(coords_train, y_train, X_train, gwr_bw)
gwr_results = model.fit()

X_test = gdf_test.drop(['Y', 'geometry'], axis=1).values
y_test = gdf_test['Y'].values.reshape((-1,1))
u = gdf_test.geometry.x
v = gdf_test.geometry.y
coords_test = np.array(list(zip(u,v)))  # https://github.com/pysal/mgwr/issues/85
scale = gwr_results.scale
residuals = gwr_results.resid_response

pred_results = model.predict(coords_test, X_test, scale, residuals)

Up to this point everything runs without error. But when I try to print the prediction result:

pred_results.predictions

it raises the following error:

ValueError: operands could not be broadcast together with shapes (201,106) (201,103)

How can I fix this? I want to check the R² of the predicted results.
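Once pred_results.predictions returns an array, the R² mentioned above can be computed with plain NumPy. The following is a minimal sketch, not part of mgwr: the helper name `r_squared` and the toy arrays are hypothetical stand-ins for y_test and pred_results.predictions.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    # Flatten so that (n, 1) and (n,) inputs are handled the same way.
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Toy data standing in for y_test and pred_results.predictions (assumed shapes (n, 1)).
y_demo = np.array([[1.0], [2.0], [3.0], [4.0]])
pred_demo = np.array([[1.1], [1.9], [3.2], [3.8]])
r2_demo = r_squared(y_demo, pred_demo)
```

With the real data, the call would presumably be r_squared(y_test, pred_results.predictions).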

@X-Fan-Jack

Maybe this helps to figure out what is wrong.
The total dataset has 1004 rows.

I pass 105 independent variables, so X_train has shape (803, 105).
After creating the model with model = GWR(coords_train, y_train, X_train, gwr_bw), I checked model.X.shape, and it had changed to (803, 103).

I don't know why 2 columns are missing, and I think that is why the shapes do not match the 106 columns of model.P, whose shape is (201, 106).
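The shape mismatch can be reproduced outside mgwr. The sketch below assumes that the predictions are formed by an element-wise product of two per-location arrays. That internal detail is an assumption, and the array names are placeholders; only the shapes are taken from the report.

```python
import numpy as np

# Shapes taken from the report; the contents are placeholders.
P_demo = np.ones((201, 106))       # same shape as model.P
params_demo = np.ones((201, 103))  # hypothetical array with the 103 columns seen in model.X

try:
    P_demo * params_demo           # element-wise product needs matching column counts
    message = ""
except ValueError as exc:
    message = str(exc)

# Column counts differ by 3, which is what NumPy refuses to broadcast.
column_gap = P_demo.shape[1] - params_demo.shape[1]
```

One reading consistent with these numbers, offered as an assumption: 105 supplied columns plus one intercept column gives 106, and 105 minus 3 removed columns plus one intercept gives 103.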

@X-Fan-Jack

The data I use contains 3 columns of 0 values, and they share some characteristics with other columns.

After I delete these columns, pred_results.predictions works and returns an array.

I assume that GWR automatically deletes columns that contain only zero values. Is that correct? And will this lose some data features and lead to inaccurate results?
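The workaround described above, removing the zero-only columns before fitting, can be sketched as follows. The helper `drop_constant_columns` is hypothetical and not part of mgwr. Whether GWR removes such columns internally is not confirmed here. A column that takes a single value in the training set carries no variation for the regression to use beyond what an intercept already captures, so dropping it should not discard predictive information.

```python
import numpy as np

def drop_constant_columns(X_train, X_test):
    """Remove columns that are constant in X_train from both arrays."""
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    # Peak-to-peak range of 0 means the column has a single value in training.
    keep = np.ptp(X_train, axis=0) > 0
    dropped = np.flatnonzero(~keep)
    # Apply the same mask to both sets so their column counts stay aligned.
    return X_train[:, keep], X_test[:, keep], dropped

# Toy data: the middle column is all zeros in the training set.
X_train_demo = np.array([[1.0, 0.0, 5.0], [2.0, 0.0, 6.0], [3.0, 0.0, 7.0]])
X_test_demo = np.array([[4.0, 0.0, 8.0], [5.0, 1.0, 9.0]])
X_tr, X_te, dropped_demo = drop_constant_columns(X_train_demo, X_test_demo)
```

The mask is computed from the training set only and then reused for the test set, so both arrays passed to GWR and to model.predict end up with the same columns.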

Thanks to the development team for providing this package!
